5 · Hugging Face Blog · 1mo ago

Granite 4.0 Nano: Just how small can you go?

IBM has released Granite 4.0 Nano, a small-footprint language model in the Granite 4.0 family, published via the Hugging Face blog. The post explores the capabilities and trade-offs of pushing model size to its lower limits while maintaining practical utility. This release is part of IBM's ongoing effort to develop efficient, enterprise-deployable AI models under the Granite brand.

Open Weights Progress · Inference Economics · Enterprise Deployment Patterns · Granite 4.0 · IBM · Granite 4.0 Nano · Hugging Face

Related guides (4)

Hugging Face

Hugging Face: The Home of Open-Source AI

Open Weights Progress · Topic guide

Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier

Enterprise Deployment Patterns · Topic guide

Enterprise Deployment Patterns: From LLM Demo to Production Reality

Inference Economics · Topic guide

Inference Economics: The Cost Structure of Running AI Models in Production

Related events (8)

5 · Hugging Face Blog · 1mo ago

Granite 4.0 3B Vision: Compact Multimodal Intelligence for Enterprise Documents

IBM released Granite 4.0 3B Vision, a compact multimodal model targeting enterprise document understanding tasks. The model is hosted on Hugging Face and positioned for deployment in resource-constrained enterprise environments. As a 3B-parameter vision-language model, it competes in the small-but-capable segment increasingly favored for on-premise and edge deployments.

Open Weights Progress · Enterprise Deployment Patterns · IBM · Hugging Face · Granite 4.0 3B Vision · +1 more

5 · Hugging Face Blog · 1mo ago

Granite 4.1 LLMs: How They're Built

IBM has published a blog post on Hugging Face detailing the construction of its Granite 4.1 language models. The post covers architectural and training decisions behind the new model family. It offers insight into IBM's continued investment in open enterprise LLMs but lacks the full technical depth of a primary research paper.

Open Weights Progress · Enterprise Deployment Patterns · IBM · Hugging Face · Granite 4.1

8 · OpenAI Blog · 1mo ago

Introducing GPT-5.4 mini and nano

OpenAI has released GPT-5.4 mini and nano, smaller and faster variants of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads. These models are positioned for efficiency-sensitive deployment scenarios including agentic pipelines. The release extends the GPT-5.4 family with tiered model options targeting different cost and latency tradeoffs.

Frontier Model Releases · Inference Economics · OpenAI · GPT-5.4 mini · GPT-5.4 nano · +3 more

6 · Google DeepMind Blog · 1mo ago

Nano Banana 2: Combining Pro capabilities with lightning-fast speed

Google DeepMind has announced Nano Banana 2, a new image generation model described as combining Pro-level capabilities with Flash-level inference speed. The model is positioned as production-ready, featuring advanced world knowledge, subject consistency, and fast generation. The announcement appears to target developers and enterprise users seeking high-quality image generation at lower latency.

Frontier Model Releases · Inference Economics · Google DeepMind · Nano Banana 2 · +1 more

5 · Google DeepMind Blog · 1mo ago

Introducing Gemma 3 270M: The compact model for hyper-efficient AI

Google DeepMind has released Gemma 3 270M, a 270-million parameter compact language model added to the Gemma 3 family. The model is positioned as a highly specialized, hyper-efficient tool for resource-constrained deployments. This extends the Gemma 3 lineup into the sub-billion parameter range, targeting edge and on-device use cases.

Open Weights Progress · Inference Economics · Gemma 3 · Google DeepMind · Gemma 3 270M · +1 more

5 · Hugging Face Blog · 1mo ago

Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context

IBM released Granite Embedding Multilingual R2, an open-weights (Apache 2.0) multilingual embedding model with a 32K context window, claiming best-in-class retrieval quality among sub-100M-parameter models. The model is positioned for enterprise RAG and retrieval use cases across multiple languages. It is hosted on and announced via Hugging Face.

Long Context Evolution · Open Weights Progress · Granite Embedding Multilingual R2 · IBM · Apache 2.0 · +2 more

6 · The Batch · 18d ago

Google launches Gemini 3.1 Flash Image (Nano Banana 2), faster and cheaper image generation

Google released Gemini 3.1 Flash Image (internally codenamed Nano Banana 2), a successor to Nano Banana Pro that is approximately four times faster and half the cost per image. The system is built on a mixture-of-experts transformer based on Gemini 3 Flash and supports up to 4096x4096 resolution, multilingual text rendering, and character consistency across images. It leads the Arena.ai text-to-image leaderboard by human preference (1,280 Elo) and competes closely with OpenAI's GPT Image 1.5 across multiple leaderboards, positioning Google competitively in the rapidly escalating image generation market.

Frontier Model Releases · Inference Economics · GPT-Image-1.5 · Google · SynthID · +7 more

7 · The Batch · 20d ago

Data Points: China Blocks Meta-Manus Deal; Microsoft-OpenAI Restructure; Nvidia Nemotron Omni; Grok 4.3; OpenAI AGI Principles; IBM Granite 4.1

A roundup of major AI developments: Chinese regulators blocked Meta's acquisition of Singapore-based agent startup Manus on security grounds; Microsoft and OpenAI restructured their partnership, with OpenAI gaining freedom to sell on rival clouds while Microsoft loses its AGI-access clause; Nvidia released Nemotron 3 Nano Omni, a 30B MoE omnimodal open-weights model for local agent deployment; xAI shipped Grok 4.3 with a 1M-token context window at reduced pricing; OpenAI published AGI operating principles; and IBM released Granite 4.1 across language, vision, speech, embedding, and safety modalities.

Long Context Evolution · Frontier Model Releases · Palantir · IBM · Microsoft · +17 more