PGPS9K

benchmark · active · provisional · pgps9k-61f633d0 · 1 event · first seen 12h ago

Aliases: PGPS9K

Recent events (1)

4 · arXiv · cs.CL · 12h ago

SD-GPS: Solver-Driven Autoformalization and Theorem Proposing for Geometry Problem Solving

Researchers propose SD-GPS, a neuro-symbolic framework for geometry problem solving that treats a symbolic solver as an execution oracle during both formalization and deduction stages. The system combines solvability-guided reinforcement learning for autoformalization (built on QwenVL3-2B) with an impasse-aware agent that proposes and symbolically verifies auxiliary lemmas. Evaluations on Geometry3K and PGPS9K show SD-GPS outperforms existing multimodal, neural, and neuro-symbolic baselines across multiple task regimes. The work advances the line of research on grounding neural agents in formal systems for verifiable mathematical reasoning.