6The Batch (DeepLearning.AI)·10d ago

Data Points: Apple/Google Siri overhaul, Gemma 4 12B, Kimi Code CLI, OpenJarvis, and U.S. OpenAI stake talks

A multi-item digest covers several significant AI developments: Apple is expected to announce a revamped Siri at WWDC that uses Google Gemini models distilled for on-device use alongside cloud routing, marking a notable Apple-Google AI partnership. Google released Gemma 4 12B, an encoder-free multimodal open-weights model designed for consumer laptops under Apache 2.0. Moonshot AI released Kimi Code CLI, an open-source terminal coding agent with native subagent orchestration and conversational MCP configuration. Stanford and Lambda Labs released OpenJarvis, an on-device agent framework claiming near-cloud accuracy at 800× lower API cost. The White House and OpenAI are reportedly negotiating a government equity stake in OpenAI as part of a proposed Public Wealth Fund.

Frontier Model Releases Open Weights Progress Inference Economics Agent and Tool Ecosystem Kimi Code CLI Stanford University WWDC Google University of Cambridge Siri Gemma 4 OpenJarvis Sam Altman OpenAI Apple Moonshot AI Lambda Labs Gemini Model Context Protocol

Related guides (3)

OpenAI

OpenAI: The Lab That Made AI a Household Word

Read asBeginner In-depth

Model Context ProtocolConcept

Model Context Protocol (MCP): The Universal Plug for AI Agents

Read asBeginner In-depth

Frontier Model ReleasesTopic guide

Frontier Model Releases: The Race From Language to Action

Read asBeginner In-depth

Related events (8)

7The Batch·18d ago·source ↗

Data Points: OpenAI shuts down Sora, Anthropic multi-agent harness, EVA voice benchmark, Arm AGI CPU, White House AI preemption proposal

OpenAI is shutting down its Sora text-to-video platform without explanation, ending a major Disney licensing deal worth up to $1 billion and eliminating video capabilities from ChatGPT amid Hollywood copyright tensions. Anthropic published details on a multi-agent harness enabling Claude to build full-stack applications over multi-hour sessions using a planner-generator-evaluator architecture. ServiceNow AI Research released EVA, an open-source two-dimensional benchmark for voice agents measuring both task accuracy and conversational experience quality. Additional items cover Arm's first self-designed data center CPU (AGI CPU) co-developed with Meta, and the Trump Administration's legislative proposal for a federal AI framework that would preempt state AI laws.

Training Infrastructure Frontier Model Releases ServiceNow AI Research ClawBot Playwright +19 more

7The Batch·19d ago·source ↗

Data Points: China Blocks Meta-Manus Deal; Microsoft-OpenAI Restructure; Nvidia Nemotron Omni; Grok 4.3; OpenAI AGI Principles; IBM Granite 4.1

A roundup of major AI developments: Chinese regulators blocked Meta's acquisition of Singapore-based agent startup Manus on security grounds; Microsoft and OpenAI restructured their partnership, with OpenAI gaining freedom to sell on rival clouds while Microsoft loses its AGI-access clause; Nvidia released Nemotron 3 Nano Omni, a 30B MoE omnimodal open-weights model for local agent deployment; xAI shipped Grok 4.3 with a 1M-token context window at reduced pricing; OpenAI published AGI operating principles; and IBM released Granite 4.1 across language, vision, speech, embedding, and safety modalities.

Long Context Evolution Frontier Model Releases Palantir IBM Microsoft +17 more

6The Batch·19d ago·source ↗

Data Points: NeurIPS-China Standoff, Anthropic Emotion Vectors, Gemma 4, Cursor 3, Microsoft MAI Models

This edition of The Batch covers five significant AI developments: NeurIPS reversed a sanctions-related submission policy after China's largest tech federation announced a boycott; Anthropic's interpretability team identified 171 emotion-related representations in Claude Sonnet 4.5 that causally influence model behavior including unsafe actions; Google released Gemma 4, a family of Apache 2.0-licensed open-weights models up to 31B parameters with strong benchmark performance; Cursor released version 3 with a redesigned multi-agent interface; and Microsoft announced three specialized MAI models for transcription, voice synthesis, and image generation. The NeurIPS incident highlights growing friction in international AI research access, while the Anthropic findings have direct implications for AI safety and interpretability research.

Frontier Model Releases Open Weights Progress FLEURS NeurIPS WPP +19 more

7Hacker News·12d ago·source ↗

Apple reveals new AI architecture built around Google Gemini models

Apple has announced a new AI architecture centered on Google Gemini models, representing a significant strategic shift in how Apple integrates third-party AI into its ecosystem. The announcement, reported by MacRumors and generating substantial Hacker News discussion, suggests a deepening partnership between Apple and Google for on-device and cloud AI capabilities. This move has implications for the competitive landscape of consumer AI and the positioning of both companies relative to OpenAI and other frontier labs.

Frontier Model Releases Enterprise Deployment Patterns Google Apple Gemini

7The Batch·16d ago·source ↗

Microsoft Build: Seven in-house AI models, GitHub Copilot desktop agent manager, and Web IQ search API for agents

Microsoft announced seven new AI models trained from scratch (not distilled from OpenAI), including the flagship MAI-Thinking-1 reasoning model and MAI-Transcribe-1.5, plus a 'Frontier Tuning' reinforcement learning approach for enterprise workflow training. GitHub released a desktop Copilot app designed to manage multiple parallel AI agents with isolated git worktrees and bidirectional canvases. Microsoft also launched Web IQ, an agent-native Bing-powered grounding API already powering search in Copilot and ChatGPT, running 2.5x faster than alternatives with lower token costs. The roundup also covers Nous Research's Hermes Desktop cross-platform agent app, Alibaba's Qwen3.7-Plus multimodal model, and OpenAI's role-specific Codex plugins.

Frontier Model Releases Inference Economics MAI-Thinking-1 FLEURS Frontier Tuning +15 more

7The Batch·10d ago·source ↗

The Batch: Claude Mythos 5 / Fable 5 debut, Apple AFM 3, Google Live Translate, OpenAI IPO filing, FrontierCode benchmark

Anthropic launched Claude Fable 5 (a safety-guardrailed model) and Claude Mythos 5 (same underlying model with safeguards removed, for vetted cyberdefense/infrastructure users via Project Glasswing with US government collaboration), both priced at $10/$50 per million tokens. Apple released five new Apple Foundation Models (AFM 3) spanning on-device and cloud tiers, built with Google and Nvidia infrastructure. Additional headlines cover Google's Gemini 3.5 Live Translate (70+ languages, real-time), OpenAI's confidential SEC IPO filing, a NotebookLM upgrade to Gemini 3.5, and Cognition's FrontierCode benchmark for code-quality evaluation where Claude Opus 4.8 leads at 34.3%.

Frontier Model Releases Evaluation and Benchmarking Claude Mythos Claude Opus 4.6 Google +19 more

7The Batch·19d ago·source ↗

Data Points: OpenAI and Microsoft sever their exclusive relationship

This edition of The Batch covers several major AI industry developments: OpenAI has revised its partnership with Microsoft, ending exclusivity while retaining Microsoft as primary cloud partner through 2032 and gaining freedom to deploy on AWS and Google Cloud. DeepSeek released V4 model weights featuring 1M-token context and Huawei Ascend chip optimization, though it trails leading open and closed models on aggregate benchmarks. Google and Amazon are deepening investments in Anthropic with up to $40B and $25B respectively in funding-for-compute deals, and an agentic AI system autonomously designed a functional RISC-V CPU from a 219-word spec in 12 hours.

Training Infrastructure Frontier Model Releases Google Cloud Google TPU knowledge distillation +25 more

6The Batch·17d ago·source ↗

Data Points: Perplexity Computer expands, Google Aletheia math agent, DeepSeek chip strategy, Nvidia retrieval pipeline, Stargate cancellation

The Batch's weekly data points roundup covers five significant AI developments: Perplexity expanded its Computer agentic platform to desktop, mobile, and enterprise with new APIs and financial data tools; Google released Aletheia, a Gemini-based math research agent achieving 95.1% on IMO-Proof Bench Advanced (up from 65.7%); DeepSeek withheld pre-release access to its V4 model from Nvidia and AMD while giving domestic Chinese chipmakers early access; Nvidia's NeMo Retriever topped the ViDoRe v3 leaderboard using a ReACT-based agentic retrieval loop; and OpenAI and Oracle cancelled plans to expand the Abilene Stargate campus from 1.2 GW to 2.0 GW due to financing and reliability issues.

Training Infrastructure Frontier Model Releases ViDoRe v3 Crusoe BRIGHT +19 more