7The Batch (DeepLearning.AI)·4d ago

The Batch: Jalapeño inference chip, Fugu multi-agent system, Claude Tag, Robin bio-agent, and Getty-OpenAI deal

OpenAI and Broadcom announced Jalapeño, OpenAI's first custom inference chip, designed in nine months with AI-assisted design and showing better performance-per-watt than current accelerators; engineering samples are already running GPT-5.3-Codex-Spark with datacenter deployment planned by end of 2026. Sakana AI released Fugu, a multi-agent routing system that scored 73.7% on SWE-Bench Pro, outperforming Claude Opus 4.8 and GPT-5.5 while remaining below the inaccessible Fable 5. Additional items cover Anthropic's Claude Tag Slack integration for async team collaboration, Seedance 2.5 video model improvements, the Robin autonomous biology research agent that identified a novel drug candidate, and a Getty Images licensing partnership with OpenAI.

Related guides (4)

Frontier Model ReleasesTopic guide

Frontier Model Releases: The Race From Language to Action

Read asBeginner In-depth

Anthropic

Anthropic: Frontier AI Lab at the Intersection of Capability and Safety Governance

Read asIn-depth

Training InfrastructureTopic guide

Training Infrastructure: The Compute Arms Race Powering Modern AI

Read asBeginner In-depth

GPT-5.5

GPT-5.5: OpenAI's Most Capable Model — and Its Most Complicated

Read asBeginner

Related events (8)

6The Batch·28d ago·source ↗

Data Points: Nvidia Ising Models for Quantum Computing, Meta Muse Spark, GitHub Rubber Duck, Anthropic Claude Managed Agents, GPT-5.4-Cyber

Nvidia released Ising, a family of open AI models targeting quantum processor calibration and error correction, achieving 2.5x faster and 3x more accurate decoding than pyMatching, with adoption by Fermilab, Harvard, and others. Meta announced Muse Spark, a small multimodal model powering a new AI assistant series for its apps and glasses. GitHub introduced Rubber Duck, a cross-model review feature pairing Claude with GPT-5.4 for two-pass coding agent validation. Anthropic launched Claude Managed Agents, a managed infrastructure platform for enterprise autonomous AI deployment, while OpenAI expanded its Trusted Access for Cyber program with GPT-5.4-Cyber, a fine-tuned defensive cybersecurity model.

Frontier Model Releases Inference Economics Rubber Duck Notion GPT-5.5-Cyber +22 more

8Openai Blog·5d ago·source ↗

OpenAI and Broadcom unveil Jalapeño, a custom LLM inference chip

OpenAI and Broadcom have jointly announced Jalapeño, a custom silicon chip designed specifically for LLM inference workloads. The chip targets improvements in performance, efficiency, and scale across AI systems. This marks OpenAI's entry into custom inference silicon, reducing dependence on third-party GPU suppliers for inference at scale.

Training Infrastructure Inference Economics Broadcom OpenAI Jalapeño

6The Batch·7d ago·source ↗

The Batch digest: U.S. chatbot adoption tops 50%, AA-Briefcase benchmark, ARD spec, North Mini Code, Fable/Mythos export controls

A weekly digest from DeepLearning.AI covers five AI developments: a Pew Research Center survey showing nearly half of U.S. adults now use AI chatbots (ChatGPT at 44% adoption); Artificial Analysis releasing AA-Briefcase, a new benchmark for complex knowledge-work tasks where Claude Opus 4.8 is a top performer; Hugging Face publishing a reference implementation of the Agentic Resource Discovery (ARD) open spec co-developed with Microsoft, Google, and others for runtime tool discovery by agents; Cohere releasing North Mini Code, a 30B-parameter open-weight MoE coding model under Apache 2.0; and over 100 cybersecurity professionals signing an open letter urging the U.S. government to reverse export controls on Anthropic's Claude Fable 5 and Claude Mythos 5. The ARD and export-control items are the highest-signal stories, touching agent infrastructure standards and AI regulatory policy respectively.

Evaluation and Benchmarking Open Weights Progress Artificial Analysis DeepLearning.AI Claude Mythos +22 more

7The Batch·26d ago·source ↗

Data Points: GPT-5.4 Pro, Luma Uni-1, Phi-4-reasoning-vision-15B, Yuan 3.0 Ultra, OpenAI hardware chief resignation

The Batch's weekly roundup covers several significant AI developments: OpenAI released GPT-5.4 and GPT-5.4 Pro with computer-use agent capabilities, 1M token context, and strong benchmark gains on GDPval and OSWorld-Verified; Luma AI released Uni-1, a unified autoregressive model for visual understanding and generation; Microsoft released Phi-4-reasoning-vision-15B, an open-weights multimodal model trained on 200B tokens; Yuan Lab AI released Yuan 3.0 Ultra, a 1T-parameter MoE model with SOTA on document retrieval benchmarks. Additionally, OpenAI hardware chief Caitlin Kalinowski resigned over the company's Pentagon deal, citing concerns about surveillance and autonomous weapons governance.

Frontier Model Releases Open Weights Progress Black Forest Labs Layer-Adaptive Expert Pruning Caitlin Kalinowski +19 more

6The Batch·26d ago·source ↗

Data Points: NemoClaw enterprise stack, GPT-5.4 mini/nano, Nemotron 3 Nano 4B, Midjourney V8, and Mamba-3

A multi-item roundup covers several AI developments: Nvidia unveiled NemoClaw at GTC 2026, an enterprise software stack integrating with OpenClaw to add security and governance for agentic deployments, with launch partners including Salesforce, Cisco, and CrowdStrike. OpenAI released GPT-5.4 mini and nano, smaller variants optimized for speed with benchmark results on SWE-Bench Pro and OSWorld-Verified, priced at $0.75 and $0.20 per million input tokens respectively. Nvidia also released Nemotron 3 Nano 4B, a hybrid Mamba-Transformer 4B parameter on-device model. Additional items cover Midjourney V8 alpha (5x faster, diffusion-only) and Mamba-3, a 1.5B state space model from CMU and Together.AI with improved accuracy over Mamba-2.

Frontier Model Releases Inference Economics Midjourney Mamba Carnegie Mellon University +19 more

7The Batch·19d ago·source ↗

The Batch: Claude Mythos 5 / Fable 5 debut, Apple AFM 3, Google Live Translate, OpenAI IPO filing, FrontierCode benchmark

Anthropic launched Claude Fable 5 (a safety-guardrailed model) and Claude Mythos 5 (same underlying model with safeguards removed, for vetted cyberdefense/infrastructure users via Project Glasswing with US government collaboration), both priced at $10/$50 per million tokens. Apple released five new Apple Foundation Models (AFM 3) spanning on-device and cloud tiers, built with Google and Nvidia infrastructure. Additional headlines cover Google's Gemini 3.5 Live Translate (70+ languages, real-time), OpenAI's confidential SEC IPO filing, a NotebookLM upgrade to Gemini 3.5, and Cognition's FrontierCode benchmark for code-quality evaluation where Claude Opus 4.8 leads at 34.3%.

Frontier Model Releases Evaluation and Benchmarking Claude Mythos Claude Opus 4.6 Google +19 more

7The Batch·28d ago·source ↗

Data Points: OpenAI and Microsoft sever their exclusive relationship

This edition of The Batch covers several major AI industry developments: OpenAI has revised its partnership with Microsoft, ending exclusivity while retaining Microsoft as primary cloud partner through 2032 and gaining freedom to deploy on AWS and Google Cloud. DeepSeek released V4 model weights featuring 1M-token context and Huawei Ascend chip optimization, though it trails leading open and closed models on aggregate benchmarks. Google and Amazon are deepening investments in Anthropic with up to $40B and $25B respectively in funding-for-compute deals, and an agentic AI system autonomously designed a functional RISC-V CPU from a 219-word spec in 12 hours.

Training Infrastructure Frontier Model Releases Google Cloud Google TPU knowledge distillation +25 more

6The Batch·26d ago·source ↗

Data Points: Perplexity Computer expands, Google Aletheia math agent, DeepSeek chip strategy, Nvidia retrieval pipeline, Stargate cancellation

The Batch's weekly data points roundup covers five significant AI developments: Perplexity expanded its Computer agentic platform to desktop, mobile, and enterprise with new APIs and financial data tools; Google released Aletheia, a Gemini-based math research agent achieving 95.1% on IMO-Proof Bench Advanced (up from 65.7%); DeepSeek withheld pre-release access to its V4 model from Nvidia and AMD while giving domestic Chinese chipmakers early access; Nvidia's NeMo Retriever topped the ViDoRe v3 leaderboard using a ReACT-based agentic retrieval loop; and OpenAI and Oracle cancelled plans to expand the Abilene Stargate campus from 1.2 GW to 2.0 GW due to financing and reliability issues.

Training Infrastructure Frontier Model Releases ViDoRe v3 Crusoe BRIGHT +19 more