Almanac
← Events
6The Batch (DeepLearning.AI)·18d ago

The Batch Issue 346: Nvidia Nemotron Super 120B, OpenAI-Amazon Deal, Regulatory Commentary

The Batch's weekly digest covers Nvidia's release of Nemotron 3 Super 120B-A12B, an open-weights hybrid mamba-2/transformer/MoE model with 1M token context trained on 25 trillion tokens, positioned as a speed leader in its size class for agentic applications. The issue also touches on OpenAI's Amazon deal and Grok video pricing cuts. Editor Andrew Ng's letter addresses the White House's proposed federal AI preemption framework and critiques what he characterizes as coordinated anti-AI messaging campaigns. Multiple significant industry developments are bundled in a single newsletter digest.

Related guides (3)

Related events (8)

7The Batch·18d ago·source ↗

Nvidia releases Nemotron 3 Super 120B-A12B open-weights model with hybrid Mamba-2/MoE architecture

Nvidia released Nemotron 3 Super 120B-A12B, an open-weights LLM with a hybrid Mamba-2/transformer/MoE architecture that activates only 12B parameters per token and supports up to 1 million token context. The model claims the fastest inference speed in its size class at 442 tokens/second and leads open-weights models on PinchBench agentic task evaluation, outperforming larger models including Kimi K2.5 (1T parameters). Nvidia is releasing weights, training data, and recipes under a permissive commercial license, and plans a $26B five-year investment in open-weights models — framed partly as a strategic response to Chinese labs building capable open-weights models on non-Nvidia hardware.

6The Batch·15d ago·source ↗

The Batch Issue 356: Qwen3.7-Max release, White House AI executive order, fine-tuning breaks copyright alignment

The Batch issue 356 covers several distinct AI developments: Alibaba's release of Qwen3.7-Max, a closed-weights flagship LLM targeting agentic coding and scientific tasks with a novel RL training approach that decouples task, harness, and verifier; a new White House executive order on frontier AI models focused on cybersecurity, including voluntary model-sharing with government; and a finding that fine-tuning breaks copyright alignment in LLMs. Andrew Ng's editorial commentary frames the executive order as a reasonable compromise, noting Anthropic's Mythos vulnerability-detection model as a key driver of the cybersecurity concerns behind the regulation.

7The Batch·19d ago·source ↗

Data Points: OpenAI and Microsoft sever their exclusive relationship

This edition of The Batch covers several major AI industry developments: OpenAI has revised its partnership with Microsoft, ending exclusivity while retaining Microsoft as primary cloud partner through 2032 and gaining freedom to deploy on AWS and Google Cloud. DeepSeek released V4 model weights featuring 1M-token context and Huawei Ascend chip optimization, though it trails leading open and closed models on aggregate benchmarks. Google and Amazon are deepening investments in Anthropic with up to $40B and $25B respectively in funding-for-compute deals, and an agentic AI system autonomously designed a functional RISC-V CPU from a 219-word spec in 12 hours.

7The Batch·19d ago·source ↗

Data Points: Qwen3.7-Max, OpenAI Math Proof, Gated DeltaNet-2, Trump AI Order, Microsoft Fara1.5

This edition of The Batch covers five significant AI developments: Alibaba's Qwen3.7-Max reasoning model with 1M token context and agentic capabilities ranking fifth on the Artificial Analysis Intelligence Index; an OpenAI reasoning model resolving the 80-year-old Erdős planar unit distance problem; Nvidia's Gated DeltaNet-2 outperforming Mamba-3 and other linear attention architectures; Trump pulling back a proposed AI regulation executive order; and Microsoft Research's Fara1.5 computer-use agent family beating OpenAI Operator and Google Gemini on the Online-Mind2Web benchmark.

7The Batch·19d ago·source ↗

Data Points: China Blocks Meta-Manus Deal; Microsoft-OpenAI Restructure; Nvidia Nemotron Omni; Grok 4.3; OpenAI AGI Principles; IBM Granite 4.1

A roundup of major AI developments: Chinese regulators blocked Meta's acquisition of Singapore-based agent startup Manus on security grounds; Microsoft and OpenAI restructured their partnership, with OpenAI gaining freedom to sell on rival clouds while Microsoft loses its AGI-access clause; Nvidia released Nemotron 3 Nano Omni, a 30B MoE omnimodal open-weights model for local agent deployment; xAI shipped Grok 4.3 with a 1M-token context window at reduced pricing; OpenAI published AGI operating principles; and IBM released Granite 4.1 across language, vision, speech, embedding, and safety modalities.

6The Batch·17d ago·source ↗

Data Points: NemoClaw enterprise stack, GPT-5.4 mini/nano, Nemotron 3 Nano 4B, Midjourney V8, and Mamba-3

A multi-item roundup covers several AI developments: Nvidia unveiled NemoClaw at GTC 2026, an enterprise software stack integrating with OpenClaw to add security and governance for agentic deployments, with launch partners including Salesforce, Cisco, and CrowdStrike. OpenAI released GPT-5.4 mini and nano, smaller variants optimized for speed with benchmark results on SWE-Bench Pro and OSWorld-Verified, priced at $0.75 and $0.20 per million input tokens respectively. Nvidia also released Nemotron 3 Nano 4B, a hybrid Mamba-Transformer 4B parameter on-device model. Additional items cover Midjourney V8 alpha (5x faster, diffusion-only) and Mamba-3, a 1.5B state space model from CMU and Together.AI with improved accuracy over Mamba-2.

7The Batch·35h ago·source ↗

Nvidia Nemotron 3 Ultra: hybrid Mamba-transformer open-weights model targeting agentic workloads

Nvidia released Nemotron 3 Ultra, a 550B parameter (55B active) hybrid Mamba-transformer mixture-of-experts model with a 1M token context window, publishing weights, training data, and RL environments under an open license. The model ranks as the highest-scoring U.S. open-weights model on the Artificial Analysis Intelligence Index (47.7-48.2) and is approximately three times faster than comparable open-weights rivals, though it trails leading Chinese models like Kimi K2.6 and DeepSeek V4 Pro on intelligence benchmarks. Nvidia used a novel Multi-Teacher On-Policy Distillation approach with 10+ specialized teacher models and trained using NVFP4 quantization. The release is strategically motivated by Nvidia's interest in a healthy open-weights ecosystem that drives AI semiconductor adoption.

6The Batch·18d ago·source ↗

The Batch Issue 345: Iranian Drone Attacks on AWS Data Centers, Qwen3.5, DeepSeek-Huawei, and AI Job Insecurity

Andrew Ng's weekly newsletter covers several significant AI-adjacent developments: Iranian drones struck at least three Amazon Web Services data centers in Bahrain and the UAE, disrupting cloud services and raising concerns given U.S. military use of AWS to run Anthropic Claude; the issue also previews Qwen3.5 model releases across multiple sizes and DeepSeek's reported moves involving Huawei hardware. Ng also addresses widespread job insecurity across skill levels amid rapid AI advancement, citing geopolitical risks including the Iran war, Taiwan uncertainty, and rare-earth metal supply chains as compounding factors.