Entity · benchmark

OEIS Conjectures

benchmark · active · oeis-conjectures-615be3bf · 1 event · first seen May 22, 2026

Aliases: OEIS Conjectures

Co-occurring entities

large language models · Erdős Problems · Lean · Formal Proof Search

More like this (12)

Cycle Double Cover Conjecture · Counterexample-Guided Inductive Synthesis (CEGIS) · Euclid-IR · Jacobian Conjecture · IMO-Proof Bench Advanced · Anti-Periodic Positional Encoding: Möbius Boundary Conditions Make In-Context Retrieval Reliable · Epanorthosis Index · Erdős Problems · Ising · MIMIC-ESI · Research Gap Inference · Inference Endpoints

Recent events (1)

8 · arXiv · cs.AI · May 22, 2026

Large-Scale Evaluation of LLM-Driven Formal Proof Search on Open Mathematical Problems

Researchers present the first large-scale evaluation of LLM-based formal proof search on genuinely open mathematical problems, using Lean as a verification backend. Their most capable agent autonomously resolved 9 of 353 open Erdős problems and proved 44 of 492 OEIS conjectures, at a cost of a few hundred dollars per problem. The system is already being deployed in active research across combinatorics, optimization, graph theory, algebraic geometry, and quantum optics. The study also compares agent architectures, finding that more sophisticated designs outperform simple generate-and-verify loops on the hardest problems.
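For readers unfamiliar with the term, the "simple generate-and-verify loop" used as the baseline above can be sketched roughly as follows. This is a toy illustration only, not the system described in the study: `generate_and_verify`, `propose`, and `check` are hypothetical names, and the verifier here is a trivial arithmetic check standing in for a proof checker such as Lean.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def generate_and_verify(
    propose: Callable[[int], T],
    check: Callable[[T], bool],
    budget: int,
) -> Tuple[Optional[T], int]:
    """Propose candidates one at a time and return the first that verifies.

    Returns (candidate, attempts_used); candidate is None if the budget
    runs out before any candidate passes the check.
    """
    for attempt in range(1, budget + 1):
        candidate = propose(attempt)   # stand-in for an LLM proposing a proof
        if check(candidate):           # stand-in for a formal verifier
            return candidate, attempt
    return None, budget


# Toy stand-ins (assumptions for illustration): the "proof" is an integer
# and "verification" is checking that its square equals 49.
def propose(attempt: int) -> int:
    return attempt


def check(candidate: int) -> bool:
    return candidate * candidate == 49
```

Calling `generate_and_verify(propose, check, budget=10)` finds the candidate 7 on the seventh attempt, while a budget of 5 is exhausted without success. The loop never uses verifier feedback to shape the next proposal, which is the sense in which it is "simple".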

Frontier Model Releases · Evaluation and Benchmarking · large language models · Erdős Problems · OEIS Conjectures