FLTEval
benchmark · active
flteval-c8576f1b · 1 event · first seen 1mo ago
Aliases: FLTEval
Recent events (1)
Mistral Releases Leanstral: First Open-Source Code Agent for Lean 4 Formal Verification
Mistral AI has released Leanstral, an open-source code agent for formal proof engineering in Lean 4, built on a sparse architecture with 120B total parameters, of which 6B are active. The model targets realistic proof engineering workflows rather than isolated math competition problems, and is benchmarked on FLTEval, a new evaluation suite tied to the Fermat's Last Theorem formalization project. Leanstral is released under Apache 2.0 with a free API endpoint and MCP support, and is reported to perform competitively with Claude Sonnet 4.6 at roughly 1/15th the cost. The release positions formal verification as a scalable alternative to human code review for high-stakes software and mathematics.