OEIS Conjectures
benchmark · active
oeis-conjectures-615be3bf · 1 event · first seen 26d ago
Aliases: OEIS Conjectures
More like this (12)
Counterexample-Guided Inductive Synthesis (CEGIS)
IMO-Proof Bench Advanced
Erdős Problems
Ising
MIMIC-ESI
Research Gap Inference
Inference Endpoints
Operads for compositional reasoning in LLMs
Equilibrium Reasoners (EqR)
Exploring Extrinsic and Intrinsic Properties for Effective Reasoning with Code Interpreter
Axiom Math
Shannon-Hartley Theorem
Recent events (1)
Large-Scale Evaluation of LLM-Driven Formal Proof Search on Open Mathematical Problems
Researchers present the first large-scale evaluation of LLM-based formal proof search on genuinely open mathematical problems, using Lean as a verification backend. Their most capable agent autonomously resolved 9 of 353 open Erdős problems and proved 44 of 492 OEIS conjectures, at a cost of a few hundred dollars per problem. The system is already being deployed in active research across combinatorics, optimization, graph theory, algebraic geometry, and quantum optics. The study also compares agent architectures, finding that more sophisticated designs outperform simple generate-and-verify loops on the hardest problems.
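
The summary contrasts more sophisticated agent designs with "simple generate-and-verify loops" but does not describe either in detail. As a rough illustration of what such a baseline loop might look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names `generate_and_verify`, `toy_candidates`, and `toy_verify` are hypothetical, and the toy verifier stands in for a real proof checker such as Lean. It is not the study's implementation.

```python
from typing import Callable, Iterable, Optional


def generate_and_verify(
    candidates: Iterable[str],
    verify: Callable[[str], bool],
    max_attempts: int,
) -> Optional[str]:
    """Return the first candidate the verifier accepts, or None if the budget runs out."""
    for attempt, candidate in enumerate(candidates):
        if attempt >= max_attempts:
            break  # attempt budget exhausted
        if verify(candidate):
            return candidate
    return None


# Toy stand-ins (hypothetical). A real system would sample candidate proofs
# from an LLM and check them with a proof assistant; here a "proof" is just a
# string naming an integer witness, and the "verifier" checks it arithmetically.
def toy_candidates() -> Iterable[str]:
    for n in range(1, 100):
        yield str(n)


def toy_verify(candidate: str) -> bool:
    n = int(candidate)
    return n * n == 49


result = generate_and_verify(toy_candidates(), toy_verify, max_attempts=20)
print(result)
```

The loop has no memory between attempts: a rejected candidate does not inform the next one. That is the sense in which it is "simple", and it hints at where a more elaborate design could differ, for example by feeding verifier feedback back into generation.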