benchmark
USAMO 2026
benchmark · active · provisional
usamo-2026-0e448be7 · 1 event · first seen 4d ago
Aliases: USAMO 2026
Recent events (1)
MaxProof achieves gold-medal-level performance on IMO 2025 and USAMO 2026 via population-level test-time scaling
MiniMax introduces MaxProof, a test-time scaling framework for competition-level mathematical proof built on their MiniMax-M3 model. The system trains three capabilities — proof generation, verification, and critique-conditioned repair — then at inference time runs tournament selection over a population of candidate proofs. MaxProof scores 35/42 on IMO 2025 and 36/42 on USAMO 2026, exceeding the human gold-medal threshold on both competitions.
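The summary above names the ingredients (generation, verification, critique-conditioned repair, tournament selection over a population) but not how they are wired together. The following is a minimal, generic sketch of one way such a loop could look, not MaxProof's actual implementation: the function names `tournament_select` and `evolve`, the parameters `rounds` and `k`, and the toy integer "proofs" are all assumptions made for illustration.

```python
import random
from typing import Callable, List, TypeVar

Proof = TypeVar("Proof")


def tournament_select(
    population: List[Proof],
    verify: Callable[[Proof], float],
    k: int,
    rng: random.Random,
) -> Proof:
    """Sample k candidates without replacement and return the best-scoring one."""
    contenders = rng.sample(population, k)
    return max(contenders, key=verify)


def evolve(
    population: List[Proof],
    verify: Callable[[Proof], float],       # hypothetical verifier score
    critique: Callable[[Proof], str],       # hypothetical critique generator
    repair: Callable[[Proof, str], Proof],  # hypothetical critique-conditioned repair
    rounds: int,
    k: int,
    rng: random.Random,
) -> Proof:
    """Run `rounds` generations of select -> critique -> repair, return the best proof."""
    for _ in range(rounds):
        next_generation = []
        for _ in range(len(population)):
            winner = tournament_select(population, verify, k, rng)
            next_generation.append(repair(winner, critique(winner)))
        population = next_generation
    return max(population, key=verify)


# Toy stand-ins: a "proof" is just an integer quality score from 0 to 7.
def toy_verify(proof: int) -> float:
    return float(proof)


def toy_critique(proof: int) -> str:
    return "complete" if proof >= 7 else "gap remains"


def toy_repair(proof: int, note: str) -> int:
    # Improve by one point unless the critique says the proof is complete.
    return proof if note == "complete" else proof + 1


best = evolve([0, 1, 2, 3], toy_verify, toy_critique, toy_repair,
              rounds=10, k=2, rng=random.Random(0))
```

In a real system the three callables would be model calls, and the verifier score would be noisy, which is where a population and repeated tournaments plausibly help more than a single best-of-n pass.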