6 · The Batch (DeepLearning.AI) · 17d ago

Data Points: Perplexity Computer expands, Google Aletheia math agent, DeepSeek chip strategy, Nvidia retrieval pipeline, Stargate cancellation

The Batch's weekly Data Points roundup covers five significant AI developments: Perplexity expanded its Computer agentic platform to desktop, mobile, and enterprise with new APIs and financial data tools; Google released Aletheia, a Gemini-based math research agent achieving 95.1% on IMO-Proof Bench Advanced (up from 65.7%); DeepSeek withheld pre-release access to its V4 model from Nvidia and AMD while giving domestic Chinese chipmakers early access; Nvidia's NeMo Retriever topped the ViDoRe v3 leaderboard using a ReAct-based agentic retrieval loop; and OpenAI and Oracle cancelled plans to expand the Abilene Stargate campus from 1.2 GW to 2.0 GW due to financing and reliability issues.

Related events (8)

7 · The Batch · 19d ago

Data Points: Qwen3.7-Max, OpenAI Math Proof, Gated DeltaNet-2, Trump AI Order, Microsoft Fara1.5

This edition of The Batch covers five significant AI developments: Alibaba's Qwen3.7-Max reasoning model with 1M token context and agentic capabilities ranking fifth on the Artificial Analysis Intelligence Index; an OpenAI reasoning model resolving the 80-year-old Erdős planar unit distance problem; Nvidia's Gated DeltaNet-2 outperforming Mamba-3 and other linear attention architectures; Trump pulling back a proposed AI regulation executive order; and Microsoft Research's Fara1.5 computer-use agent family beating OpenAI Operator and Google Gemini on the Online-Mind2Web benchmark.

7 · The Batch · 19d ago

Data Points: OpenAI and Microsoft sever their exclusive relationship

This edition of The Batch covers several major AI industry developments. OpenAI has revised its partnership with Microsoft, ending exclusivity while retaining Microsoft as primary cloud partner through 2032 and gaining freedom to deploy on AWS and Google Cloud. DeepSeek released V4 model weights featuring 1M-token context and Huawei Ascend chip optimization, though the model trails leading open and closed models on aggregate benchmarks. Google and Amazon are deepening investments in Anthropic with up to $40B and $25B respectively in funding-for-compute deals. Separately, an agentic AI system autonomously designed a functional RISC-V CPU from a 219-word spec in 12 hours.

6 · The Batch · 23d ago

Data Points: DeepSWE Benchmark, DeepSeek V4 Price Cuts, MAI-Image-2.5, Mythos Security Findings, MCP Stateless Update

This edition of The Batch covers five distinct AI developments: Datacurve's DeepSWE benchmark claims to fix critical grading flaws in SWE-bench Pro with hand-written verifiers and harder tasks; DeepSeek permanently cuts V4 Pro prices by 75%; Microsoft's MAI-Image-2.5 debuts third on the Arena leaderboard; Anthropic's Claude Mythos Preview found over 10,000 high/critical vulnerabilities in the first month of Project Glasswing, with remediation badly lagging discovery; and the Model Context Protocol proposes removing stateful sessions to enable stateless, load-balanced remote servers. Each item reflects meaningful movement in evaluation methodology, inference economics, multimodal generation, AI-assisted security, and agent tooling infrastructure.

7 · The Batch · 19d ago

Data Points: China Blocks Meta-Manus Deal; Microsoft-OpenAI Restructure; Nvidia Nemotron Omni; Grok 4.3; OpenAI AGI Principles; IBM Granite 4.1

A roundup of major AI developments: Chinese regulators blocked Meta's acquisition of Singapore-based agent startup Manus on security grounds; Microsoft and OpenAI restructured their partnership, with OpenAI gaining freedom to sell on rival clouds while Microsoft loses its AGI-access clause; Nvidia released Nemotron 3 Nano Omni, a 30B MoE omnimodal open-weights model for local agent deployment; xAI shipped Grok 4.3 with a 1M-token context window at reduced pricing; OpenAI published AGI operating principles; and IBM released Granite 4.1 across language, vision, speech, embedding, and safety modalities.

6 · The Batch · 1mo ago

Data Points: Thinking Machines Interaction Model, ERNIE 5.1, Co-Mathematician, RL Conductor, and More

This edition of The Batch covers five notable AI developments: Thinking Machines' research preview of an 'interaction model' with a 200ms micro-turn multimodal architecture; Baidu's ERNIE 5.1, a compressed derivative of ERNIE 5.0 using only 6% of typical pre-training compute; Google DeepMind's Co-Mathematician collaborative workbench reaching 48% on FrontierMath Tier 4; a 7B RL Conductor model that orchestrates multi-agent workflows via reinforcement learning; and Google's Magic Pointer cursor system powered by Gemini. Secondary items include GitHub Copilot pricing restructuring ahead of usage-based billing.

6 · The Batch · 17d ago

Data Points: NemoClaw enterprise stack, GPT-5.4 mini/nano, Nemotron 3 Nano 4B, Midjourney V8, and Mamba-3

This multi-item roundup covers several AI developments. Nvidia unveiled NemoClaw at GTC 2026, an enterprise software stack integrating with OpenClaw to add security and governance for agentic deployments, with launch partners including Salesforce, Cisco, and CrowdStrike. OpenAI released GPT-5.4 mini and nano, smaller variants optimized for speed with benchmark results on SWE-Bench Pro and OSWorld-Verified, priced at $0.75 and $0.20 per million input tokens respectively. Nvidia also released Nemotron 3 Nano 4B, a hybrid Mamba-Transformer 4B-parameter on-device model. Additional items cover Midjourney V8 alpha (5x faster, diffusion-only) and Mamba-3, a 1.5B state space model from CMU and Together.AI with improved accuracy over Mamba-2.

6 · The Batch · 19d ago

Data Points: Nvidia Ising Models for Quantum Computing, Meta Muse Spark, GitHub Rubber Duck, Anthropic Claude Managed Agents, GPT-5.4-Cyber

Nvidia released Ising, a family of open AI models targeting quantum processor calibration and error correction, achieving 2.5x faster and 3x more accurate decoding than PyMatching, with adoption by Fermilab, Harvard, and others. Meta announced Muse Spark, a small multimodal model powering a new AI assistant series for its apps and glasses. GitHub introduced Rubber Duck, a cross-model review feature pairing Claude with GPT-5.4 for two-pass coding agent validation. Anthropic launched Claude Managed Agents, a managed infrastructure platform for enterprise autonomous AI deployment, while OpenAI expanded its Trusted Access for Cyber program with GPT-5.4-Cyber, a fine-tuned defensive cybersecurity model.

6 · The Batch · 18d ago

The Batch Issue 346: Nvidia Nemotron Super 120B, OpenAI-Amazon Deal, Regulatory Commentary

The Batch's weekly digest covers Nvidia's release of Nemotron 3 Super 120B-A12B, an open-weights hybrid Mamba-2/Transformer/MoE model with a 1M-token context window, trained on 25 trillion tokens and positioned as a speed leader in its size class for agentic applications. The issue also touches on OpenAI's Amazon deal and Grok video pricing cuts. Editor Andrew Ng's letter addresses the White House's proposed federal AI preemption framework and critiques what he characterizes as coordinated anti-AI messaging campaigns.