Google's Aletheia agent uses Gemini 3 Deep Think to generate novel solutions to unsolved Erdős problems
Google researchers introduced Aletheia, an agentic workflow using Gemini 3 Deep Think that generates, verifies, and revises solutions to previously unsolved mathematical problems. Applied to Erdős problems, Aletheia produced 13 correct solutions out of 200 evaluated, 4 of which were genuinely novel contributions not found in the existing literature. The announcement also reports Gemini 3 Deep Think's benchmark performance: 48.4% on HLE, 84.6% on ARC-AGI-2, and 93.8% on GPQA Diamond. The results show both the promise and the current limits of AI-assisted mathematical research: 13 of 200 is a 6.5% correct-under-intended-interpretation rate on a hard problem set.
Related guides (4)

Google: The AI Lab That Builds Everything from DNA Models to Your Phone's Assistant
Related events (8)
AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms
DeepMind has announced AlphaEvolve, a coding agent powered by Gemini that autonomously evolves algorithms for mathematical and practical computing applications. The system combines large language model creativity with automated evaluators to iteratively improve algorithmic solutions. It represents a significant step in AI-driven algorithm discovery, extending DeepMind's prior work in this space (e.g., AlphaTensor, FunSearch). The announcement comes from DeepMind's official blog, indicating a substantive capability release rather than a research preview.
Accelerating Mathematical and Scientific Discovery with Gemini Deep Think
DeepMind published a blog post highlighting the research impact of Gemini Deep Think across mathematical and scientific domains. The post references multiple research papers demonstrating the model's growing utility in technical discovery workflows. This appears to be a capability showcase for DeepMind's extended-thinking variant of Gemini, positioning it as a tool for frontier scientific research.
AlphaEvolve: How our Gemini-powered coding agent is scaling impact across fields
DeepMind published a blog post detailing the real-world impact of AlphaEvolve, a Gemini-powered coding agent designed to discover and optimize algorithms. The post covers applications spanning business operations, infrastructure, and scientific research. AlphaEvolve represents a deployment of LLM-driven evolutionary algorithm search at scale across multiple domains.
Data Points: Perplexity Computer expands, Google Aletheia math agent, DeepSeek chip strategy, Nvidia retrieval pipeline, Stargate cancellation
The Batch's weekly data points roundup covers five significant AI developments: Perplexity expanded its Computer agentic platform to desktop, mobile, and enterprise with new APIs and financial data tools; Google released Aletheia, a Gemini-based math research agent achieving 95.1% on IMO-Proof Bench Advanced (up from 65.7%); DeepSeek withheld pre-release access to its V4 model from Nvidia and AMD while giving domestic Chinese chipmakers early access; Nvidia's NeMo Retriever topped the ViDoRe v3 leaderboard using a ReAct-based agentic retrieval loop; and OpenAI and Oracle cancelled plans to expand the Abilene Stargate campus from 1.2 GW to 2.0 GW due to financing and reliability issues.
Gemini 3.5: Frontier Intelligence with Action
Google DeepMind has announced Gemini 3.5, a new model generation positioned around agentic capabilities and complex workflow execution. The announcement emphasizes action-oriented AI, suggesting a focus on tool use, multi-step reasoning, and autonomous task completion. The blog post is brief, indicating this may be an initial announcement with further details to follow.
Google DeepMind Rolls Out Deep Think in Gemini App for Ultra Subscribers
Google DeepMind is making Deep Think available in the Gemini app for Google AI Ultra subscribers, marking a broader consumer rollout of its advanced reasoning capability. Additionally, select mathematicians are being granted access to the full Gemini 2.5 Deep Think model that was entered into the International Mathematical Olympiad (IMO) competition. This deployment follows DeepMind's earlier IMO-related capability demonstrations and represents a step toward productizing frontier mathematical reasoning.
Gemini 2.5 Deep Think Achieves Gold-Medal Level at ICPC World Finals
Google DeepMind reports that Gemini 2.5 Deep Think has achieved gold-medal-level performance at the International Collegiate Programming Contest (ICPC) World Finals, one of the most prestigious programming contests in the world. The announcement frames this as a significant advance in abstract problem-solving capability. It follows a pattern of frontier labs using competitive programming results to demonstrate reasoning breakthroughs, as with prior milestones at IOI and Codeforces. The announcement body does not detail the specific score, problem set, or evaluation methodology.
Gemini 3.1 Pro: A smarter model for your most complex tasks
Google DeepMind has announced Gemini 3.1 Pro, a new model positioned for complex reasoning tasks where simple answers are insufficient. The announcement comes from the official DeepMind blog, indicating a flagship-tier release. The body content is minimal, providing little technical detail beyond the positioning statement.


