6·OpenAI Blog·1mo ago

OpenAI Shares First Proof Math Challenge Submissions

OpenAI has published its AI model's proof attempts for the First Proof math challenge, a competition designed to test research-grade mathematical reasoning on expert-level problems. The submission demonstrates how OpenAI's models perform at generating proofs for research-level problems, and signals continued progress in AI mathematical reasoning toward the level of professional research mathematics.

Frontier Model Releases · Evaluation and Benchmarking · First Proof · OpenAI

Related guides (3)

OpenAI

OpenAI: The Lab That Made AI a Household Word

Frontier Model Releases·Topic guide

Frontier Model Releases: The Race From Language to Action

Evaluation and Benchmarking·Topic guide

Evaluation and Benchmarking: How We Measure AI — and Why It Keeps Getting Harder

Related events (8)

7·OpenAI Blog·1mo ago

OpenAI Neural Theorem Prover Solves Formal Math Olympiad Problems in Lean

OpenAI developed a neural theorem prover integrated with the Lean proof assistant that can solve challenging high-school olympiad problems, including problems from AMC12, AIME, and two IMO-adapted problems. The system demonstrates automated formal mathematical reasoning at a level previously requiring human expertise. This represents a significant capability milestone in AI-assisted formal verification and mathematical problem-solving.

Frontier Model Releases · Evaluation and Benchmarking · AIME · Neural Theorem Prover · OpenAI

9·OpenAI Blog·1mo ago

An OpenAI model has disproved a central conjecture in discrete geometry

An OpenAI model has disproved a major conjecture in discrete geometry, resolving the 80-year-old unit distance problem. The result is a milestone in AI-driven mathematical reasoning: it shows that frontier AI systems can produce novel, verifiable mathematical results rather than merely verifying or assisting with known proofs. The announcement was published on OpenAI's official blog.

Frontier Model Releases · Evaluation and Benchmarking · discrete geometry · OpenAI · unit distance problem

8·Hacker News·1mo ago

An OpenAI Model Disproves a Central Conjecture in Discrete Geometry

An OpenAI model has reportedly disproved a long-standing conjecture in discrete geometry, representing a significant AI-assisted mathematical discovery. This is a notable capability demonstration of AI systems contributing to frontier mathematical research. The announcement comes directly from OpenAI and has generated substantial community discussion on Hacker News with 462 points and 298 comments.

Frontier Model Releases · Evaluation and Benchmarking · discrete geometry · OpenAI

8·OpenAI Blog·1mo ago

Advancing science and math with GPT-5.2

OpenAI has released GPT-5.2, described as its strongest model for mathematics and science, achieving state-of-the-art results on GPQA Diamond and FrontierMath benchmarks. The announcement highlights practical research applications including solving an open theoretical problem and generating verified mathematical proofs. The post positions GPT-5.2 as a meaningful step toward AI-assisted scientific discovery.

Frontier Model Releases · Evaluation and Benchmarking · GPT-5.2 · FrontierMath · GPQA Diamond

5·OpenAI Blog·1mo ago

OpenAI Trains System Solving Grade School Math Problems at ~55% Accuracy

OpenAI released a system for solving grade school math word problems that achieves roughly twice the accuracy of a fine-tuned GPT-3 model. The system scored 55% on a sample test on which 9- to 12-year-olds scored 60%, suggesting performance close to that of children in this age range on elementary math. This work represents an early milestone in neural network mathematical reasoning capabilities.

Frontier Model Releases · Evaluation and Benchmarking · GPT-3 · OpenAI · GSM8K

5·OpenAI Blog·1mo ago

Generative Language Modeling for Automated Theorem Proving

OpenAI published research on applying generative language models to automated theorem proving, an early exploration of using neural language models to assist formal mathematical reasoning. The work investigates how language models can generate proof steps or complete proofs in formal systems, and it introduced the GPT-f prover. This represents an early milestone in AI-assisted mathematical reasoning, predating later theorem-proving systems.

Frontier Model Releases · Evaluation and Benchmarking · automated theorem proving · generative language modeling · GPT-f

6·Hugging Face Blog·1mo ago

How NuminaMath Won the 1st AIMO Progress Prize

NuminaMath won the first AI Mathematical Olympiad (AIMO) Progress Prize, a competition focused on advancing AI capabilities in mathematical reasoning. The blog post details the technical approach and methodology used by the winning team. This represents a notable milestone in AI mathematical problem-solving, a domain considered a key frontier for reasoning capabilities.

Frontier Model Releases · Evaluation and Benchmarking · AI Mathematical Olympiad · NuminaMath · Hugging Face

7·OpenAI Blog·1mo ago

Early experiments in accelerating science with GPT-5

OpenAI has published initial research cases demonstrating GPT-5's application to scientific discovery across mathematics, physics, biology, and computer science. The examples highlight human-AI collaboration in generating mathematical proofs and uncovering novel insights. This represents OpenAI's first public documentation of GPT-5's scientific research capabilities beyond general benchmarks.

Frontier Model Releases · Evaluation and Benchmarking · OpenAI · GPT-5.5