PGPS9K
benchmark · active · provisional
pgps9k-61f633d0 · 1 event · first seen 12h ago · Aliases: PGPS9K
Recent events (1)
SD-GPS: Solver-Driven Autoformalization and Theorem Proposing for Geometry Problem Solving
Researchers propose SD-GPS, a neuro-symbolic framework for geometry problem solving that treats a symbolic solver as an execution oracle during both formalization and deduction stages. The system combines solvability-guided reinforcement learning for autoformalization (built on QwenVL3-2B) with an impasse-aware agent that proposes and symbolically verifies auxiliary lemmas. Evaluations on Geometry3K and PGPS9K show SD-GPS outperforms existing multimodal, neural, and neuro-symbolic baselines across multiple task regimes. The work advances the line of research on grounding neural agents in formal systems for verifiable mathematical reasoning.
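As a rough illustration of what using a symbolic solver as an execution oracle could look like, the toy sketch below scores a candidate formalization by whether a tiny forward-chaining solver can reach the goal from it. This is not SD-GPS's implementation: the rule format, the fact strings, and the names `forward_chain` and `solvability_reward` are all hypothetical, and the binary reward is only one plausible reading of "solvability-guided".

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Hypothetical rule format: (set of premises, single conclusion).
Rule = Tuple[FrozenSet[str], str]


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Toy symbolic solver: apply rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def solvability_reward(candidate_facts: Iterable[str],
                       rules: List[Rule],
                       goal: str) -> float:
    """Assumed reward: 1.0 if the solver reaches the goal, otherwise 0.0."""
    return 1.0 if goal in forward_chain(candidate_facts, rules) else 0.0


# Illustrative rules for an isosceles triangle with a 40-degree apex angle.
RULES: List[Rule] = [
    (frozenset({"AB=AC"}), "angle_B=angle_C"),
    (frozenset({"angle_B=angle_C", "angle_A=40"}), "angle_B=70"),
]

# A formalization that captured both diagram facts, and one that missed AB=AC.
complete = {"AB=AC", "angle_A=40"}
incomplete = {"angle_A=40"}

print(solvability_reward(complete, RULES, "angle_B=70"))
print(solvability_reward(incomplete, RULES, "angle_B=70"))
```

In this sketch the formalization that omits `AB=AC` earns no reward because the solver stalls before reaching the goal, which is the kind of signal a reinforcement learning loop could use to prefer formalizations the solver can actually execute.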