7·The Batch (DeepLearning.AI)·35h ago

Nvidia Nemotron 3 Ultra: hybrid Mamba-transformer open-weights model targeting agentic workloads

Nvidia released Nemotron 3 Ultra, a 550B-parameter (55B active) hybrid Mamba-transformer mixture-of-experts model with a 1M-token context window, publishing weights, training data, and RL environments under an open license. The model ranks as the highest-scoring U.S. open-weights model on the Artificial Analysis Intelligence Index (47.7-48.2) and is approximately three times faster than comparable open-weights rivals, though it trails leading Chinese models such as Kimi K2.6 and DeepSeek V4 Pro on intelligence benchmarks. Nvidia used a novel Multi-Teacher On-Policy Distillation approach with 10+ specialized teacher models and trained the model using NVFP4 quantization. The release is strategically motivated by Nvidia's interest in a healthy open-weights ecosystem that drives AI semiconductor adoption.
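
As a rough illustration of what "55B active" means for a mixture-of-experts model, the sketch below computes the share of weights used per token from the total and active parameter counts quoted in the summaries on this page. The function name `active_fraction` and the `MODELS` table are illustrative, not from any Nvidia or Moonshot AI tooling, and the ratio ignores real-world factors such as shared layers or routing overhead.

```python
# Total and active parameter counts (in billions), as quoted in the summaries on this page.
MODELS = {
    "Nemotron 3 Ultra": (550, 55),
    "Nemotron 3 Super 120B-A12B": (120, 12),
    "Kimi K2.6": (1000, 32),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of a mixture-of-experts model's parameters used for each token."""
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.1%} of weights active per token")
```

By this simple measure both Nemotron 3 models activate about a tenth of their weights per token, while Kimi K2.6 activates a smaller share of a larger total.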

Related guides (3)

Mamba·Concept

Mamba: The Attention-Free Architecture That Scales Without Slowing Down


NVIDIA

NVIDIA: The Hardware Backbone of the AI Era


DeepSeek V4

DeepSeek V4: The Open-Weights Giant Reshaping AI Economics


Related events (8)

7·The Batch·18d ago

Nvidia releases Nemotron 3 Super 120B-A12B open-weights model with hybrid Mamba-2/MoE architecture

Nvidia released Nemotron 3 Super 120B-A12B, an open-weights LLM with a hybrid Mamba-2/transformer/MoE architecture that activates only 12B parameters per token and supports a context of up to 1 million tokens. The model claims the fastest inference speed in its size class at 442 tokens/second and leads open-weights models on the PinchBench agentic task evaluation, outperforming larger models including Kimi K2.5 (1T parameters). Nvidia is releasing weights, training data, and recipes under a permissive commercial license, and plans a $26B five-year investment in open-weights models, framed partly as a strategic response to Chinese labs building capable open-weights models on non-Nvidia hardware.

Frontier Model Releases · Open Weights Progress · Nemotron 3 Super 120B-A12B · Nemotron 3 Ultra-500B-A50B · PivotRL

5·Hugging Face Blog·1mo ago

Measuring Open-Source Llama Nemotron Models on DeepResearch Bench

NVIDIA evaluates its open-source Llama Nemotron models on the DeepResearch Bench, a benchmark designed to assess deep research agent capabilities. The post appears to report competitive performance of the Nemotron models in agentic research tasks. This is relevant to the ongoing development of open-weights models capable of multi-step research and reasoning workflows.

Evaluation and Benchmarking · Open Weights Progress · Llama Nemotron · NVIDIA · DeepResearch Bench

6·The Batch·17d ago

Data Points: NemoClaw enterprise stack, GPT-5.4 mini/nano, Nemotron 3 Nano 4B, Midjourney V8, and Mamba-3

A multi-item roundup covers several AI developments: Nvidia unveiled NemoClaw at GTC 2026, an enterprise software stack integrating with OpenClaw to add security and governance for agentic deployments, with launch partners including Salesforce, Cisco, and CrowdStrike. OpenAI released GPT-5.4 mini and nano, smaller variants optimized for speed with benchmark results on SWE-Bench Pro and OSWorld-Verified, priced at $0.75 and $0.20 per million input tokens respectively. Nvidia also released Nemotron 3 Nano 4B, a hybrid Mamba-Transformer 4B parameter on-device model. Additional items cover Midjourney V8 alpha (5x faster, diffusion-only) and Mamba-3, a 1.5B state space model from CMU and Together.AI with improved accuracy over Mamba-2.

Frontier Model Releases · Inference Economics · Midjourney · Mamba · Carnegie Mellon University

6·The Batch·18d ago

The Batch Issue 346: Nvidia Nemotron Super 120B, OpenAI-Amazon Deal, Regulatory Commentary

The Batch's weekly digest covers Nvidia's release of Nemotron 3 Super 120B-A12B, an open-weights hybrid Mamba-2/transformer/MoE model with a 1M-token context, trained on 25 trillion tokens and positioned as a speed leader in its size class for agentic applications. The issue also touches on OpenAI's Amazon deal and Grok video pricing cuts. Editor Andrew Ng's letter addresses the White House's proposed federal AI preemption framework and critiques what he characterizes as coordinated anti-AI messaging campaigns.

Frontier Model Releases · Open Weights Progress · Nemotron 3 Super 120B-A12B · Nemotron 3 Ultra-500B-A50B · DeepLearning.AI

7·Mistral AI News·19d ago

Mistral NeMo: 12B Open-Weights Model with 128k Context, Built with NVIDIA

Mistral AI and NVIDIA jointly released Mistral NeMo, a 12B-parameter model under the Apache 2.0 license featuring a 128k-token context window and a new tokenizer called Tekken, based on Tiktoken. The model is designed as a drop-in replacement for Mistral 7B, supports multilingual applications across 11+ languages, and was trained with quantization awareness, enabling FP8 inference without performance loss. Benchmark comparisons show competitive performance against Gemma 2 9B and Llama 3 8B. Weights are available on Hugging Face, and the model is also packaged as an NVIDIA NIM inference microservice.

Long Context Evolution · Frontier Model Releases · Mistral AI · Gemma 2 9B · Apache 2.0

6·The Batch·19d ago

Kimi K2.6: Moonshot AI's 1T-Parameter Vision-Language Model Matches Open-Weights Peers, Trails Top Closed Models

Moonshot AI released Kimi K2.6, a 1 trillion-parameter mixture-of-experts vision-language model with 32B active parameters, designed for long-horizon autonomous coding sessions lasting multiple days and for multi-agent orchestration that scales to 300 parallel subagents executing up to 4,000 steps. The model roughly matches Qwen3.6 Max Preview and DeepSeek-V4-Pro on the Artificial Analysis Intelligence Index (scoring 54 versus their 52) while trailing closed models such as GPT-5.5 and Claude Opus 4.7. Weights are freely downloadable from Hugging Face under a modified MIT license permitting commercial use, with API access priced at $0.95/$0.16/$4.00 per million input/cached/output tokens. Notable features include a 256K-token context window, native INT4 quantization, a 'preserve thinking' mode for multi-turn reasoning continuity, and a research preview 'claw groups' feature enabling cross-developer agent collaboration.
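
To make the quoted per-million-token prices concrete, here is a minimal cost sketch. The three rates come from the summary above; everything else is an assumption for illustration: the helper name `request_cost`, the billing model (cached tokens charged at the cached rate rather than the input rate), and the token counts in the example, which do not describe any measured workload.

```python
# Prices in USD per million tokens, as quoted for the Kimi K2.6 API in the summary above.
PRICE_PER_M = {"input": 0.95, "cached": 0.16, "output": 4.00}


def request_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request.

    Assumes (hypothetically) that cached tokens are billed at the cached
    rate instead of the regular input rate.
    """
    total = (
        input_tokens * PRICE_PER_M["input"]
        + cached_tokens * PRICE_PER_M["cached"]
        + output_tokens * PRICE_PER_M["output"]
    )
    return total / 1_000_000


# Hypothetical agent step: 20k fresh input tokens, 180k cached context tokens, 4k output tokens.
cost = request_cost(20_000, 180_000, 4_000)
print(f"${cost:.4f}")  # prints $0.0638
```

Under these assumptions, output tokens and cached context contribute most of the cost of a long agentic step, even though the cached rate is the lowest of the three.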

Frontier Model Releases · Evaluation and Benchmarking · Artificial Analysis Intelligence Index · Claude Opus 4.6 · Qwen3.6 Max Preview

6·Hugging Face Blog·1mo ago

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

NVIDIA has released Nemotron 3 Nano Omni, a multimodal model targeting long-context understanding across documents, audio, and video modalities. The model is positioned for agentic use cases requiring cross-modal reasoning. It is published via the Hugging Face blog as part of NVIDIA's Nemotron model family. No detailed technical specifications or benchmark results are provided in the available body text.

Long Context Evolution · Open Weights Progress · Nemotron 3 Nano Omni · NVIDIA · Hugging Face

5·Hugging Face Blog·16d ago

NVIDIA releases Nemotron 3.5 Content Safety, a customizable multimodal safety model for enterprise AI

NVIDIA has released Nemotron 3.5 Content Safety, a multimodal safety model designed for enterprise AI deployments with customization capabilities for global use cases. The model is announced via the Hugging Face blog, targeting content moderation and safety classification across modalities. This is relevant to the growing enterprise demand for controllable, deployable safety layers on top of foundation models.

AI Safety Research · Enterprise Deployment Patterns · Nemotron 3.5 Content Safety · NVIDIA · Hugging Face