ProofNet-Test

benchmark · active · provisional · proofnet-test-442e4c55 · 1 event · first seen 3d ago

Aliases: ProofNet-Test

Recent events (1)

6 · arXiv · cs.LG · 3d ago

Diffusion-Proof: First framework applying diffusion LLMs to formal theorem proving

Researchers introduce Diffusion-Proof, the first framework to train and apply diffusion language models (dLLMs) for formal theorem proving, addressing the limitations of autoregressive (AR) models in long-range coherence. The framework includes dLLM-Prover-7B for whole-proof generation and dLLM-Corrector-7B for local proof correction via bidirectional infilling. Diffusion-Proof achieves absolute improvements of 1.61 percentage points on ProofNet-Test and 6.14 percentage points on MiniF2F-Test over an AR baseline, and solves one IMO problem that DeepSeek-Prover-V2-7B could not. The result suggests dLLMs may have structural advantages over AR models for tasks requiring long-range logical coherence.