6 · Google DeepMind Blog · 1mo ago

Nano Banana 2: Combining Pro capabilities with lightning-fast speed

Google DeepMind has announced Nano Banana 2, a new image generation model described as combining Pro-level capabilities with Flash-level inference speed. The model is positioned as production-ready, featuring advanced world knowledge, subject consistency, and fast generation. The announcement appears to target developers and enterprise users seeking high-quality image generation at lower latency.

Frontier Model Releases · Inference Economics · Multimodal Progress · Google DeepMind · Nano Banana 2

Related guides (4)

Google DeepMind

Google DeepMind: Frontier AI Across Models, Robotics, and Scientific Discovery

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action

Multimodal Progress · Topic guide

Multimodal Progress: How AI Learned to See, Hear, and Act

Inference Economics · Topic guide

Inference Economics: The Cost Structure of Running AI Models in Production

Related events (8)

6 · The Batch · 17d ago

Google launches Gemini 3.1 Flash Image (Nano Banana 2), faster and cheaper image generation

Google released Gemini 3.1 Flash Image (internally codenamed Nano Banana 2), a successor to Nano Banana Pro that is approximately four times faster and half the cost per image. The system is built on a mixture-of-experts transformer based on Gemini 3 Flash and supports up to 4096x4096 resolution, multilingual text rendering, and character consistency across images. It leads the Arena.ai text-to-image leaderboard by human preference (1,280 Elo) and competes closely with OpenAI's GPT Image 1.5 across multiple leaderboards, positioning Google competitively in the rapidly escalating image generation market.

Frontier Model Releases · Inference Economics · GPT-Image-1.5 · Google · SynthID · +7 more

5 · The Batch · 17d ago

DeepLearning.AI launches Context Hub for coding agents; Google releases Nano Banana 2 image generator

Andrew Ng and collaborators released Context Hub (chub), an open CLI tool that provides coding agents with up-to-date API documentation to reduce hallucinated or outdated API calls. Google separately launched Nano Banana 2 (Gemini 3.1 Flash Image), a faster and cheaper image-generation system built on Gemini 3 Flash's mixture-of-experts architecture, priced at roughly half its predecessor and claiming the top spot on Arena.ai's text-to-image leaderboard. The newsletter also references Claude Opus 4.6 as a leading coding model and notes the growth of agent-to-agent social infrastructure (OpenClaw, Moltbook) as context for the tooling need.

Inference Economics · Agent and Tool Ecosystem · DeepLearning.AI · GPT-Image-1.5 · Claude Opus 4.6 · +8 more

8 · OpenAI Blog · 1mo ago

Introducing GPT-5.4 mini and nano

OpenAI has released GPT-5.4 mini and nano, smaller and faster variants of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads. These models are positioned for efficiency-sensitive deployment scenarios including agentic pipelines. The release extends the GPT-5.4 family with tiered model options targeting different cost and latency tradeoffs.

Frontier Model Releases · Inference Economics · OpenAI · GPT-5.4 mini · GPT-5.4 nano · +3 more

5 · Hugging Face Blog · 1mo ago

Granite 4.0 Nano: Just how small can you go?

IBM has released Granite 4.0 Nano, a small-footprint language model in the Granite 4.0 family, published via the Hugging Face blog. The post explores the capabilities and trade-offs of pushing model size to its lower limits while maintaining practical utility. This release is part of IBM's ongoing effort to develop efficient, enterprise-deployable AI models under the Granite brand.

Open Weights Progress · Inference Economics · Granite 4.0 · IBM · Granite 4.0 Nano · +2 more

7 · Google DeepMind Blog · 1mo ago

Gemini 2.0 Flash Native Image Generation Now Available for Developers

Google DeepMind has released native image output capability in Gemini 2.0 Flash, making it available to developers via Google AI Studio and the Gemini API. This enables the model to generate images natively rather than through a separate image generation pipeline. The release is framed as an experimental feature for developer exploration.

Frontier Model Releases · Agent and Tool Ecosystem · Google AI Studio · Gemini-2.5-Flash-Lite · Google DeepMind · +2 more

6 · Hugging Face Blog · 1mo ago

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

NVIDIA has released Nemotron 3 Nano Omni, a multimodal model targeting long-context understanding across documents, audio, and video modalities. The model is positioned for agentic use cases requiring cross-modal reasoning. It is published via the Hugging Face blog as part of NVIDIA's Nemotron model family. No detailed technical specifications or benchmark results are provided in the available body text.

Long Context Evolution · Open Weights Progress · Nemotron 3 Nano Omni · NVIDIA · Hugging Face · +3 more

6 · The Batch · 17d ago

Data Points: NemoClaw enterprise stack, GPT-5.4 mini/nano, Nemotron 3 Nano 4B, Midjourney V8, and Mamba-3

A multi-item roundup covers several AI developments. NVIDIA unveiled NemoClaw at GTC 2026, an enterprise software stack that integrates with OpenClaw to add security and governance for agentic deployments; launch partners include Salesforce, Cisco, and CrowdStrike. OpenAI released GPT-5.4 mini and nano, smaller variants optimized for speed, with benchmark results on SWE-Bench Pro and OSWorld-Verified and prices of $0.75 and $0.20 per million input tokens, respectively. NVIDIA also released Nemotron 3 Nano 4B, a 4B-parameter hybrid Mamba-Transformer model for on-device use. Additional items cover Midjourney V8 alpha (5x faster, diffusion-only) and Mamba-3, a 1.5B-parameter state space model from CMU and Together.AI with improved accuracy over Mamba-2.

Frontier Model Releases · Inference Economics · Midjourney · Mamba · Carnegie Mellon University · +19 more

5 · GitHub Trending · 1mo ago

NVlabs/Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

NVIDIA Labs has released Sana, an open-source image synthesis system using a Linear Diffusion Transformer architecture designed for efficient high-resolution image generation. The repository has accumulated 6,261 stars with 472 added in a single day, indicating strong community interest. The project targets improved computational efficiency in diffusion-based image synthesis, a key challenge for scaling to higher resolutions.

Inference Economics · Multimodal Progress · NVIDIA Labs · Linear Diffusion Transformer · Sana