7 · Mistral AI News · 1mo ago

Mistral Releases Leanstral: First Open-Source Code Agent for Lean 4 Formal Verification

Mistral AI has released Leanstral, an open-source code agent built on a sparse architecture (120B total parameters, 6B active) and designed specifically for formal proof engineering in Lean 4. The model targets realistic proof engineering workflows rather than isolated math competition problems, and is benchmarked on FLTEval, a new evaluation suite tied to the Fermat's Last Theorem formalization project. Leanstral is released under Apache 2.0 with a free API endpoint and MCP support, and Mistral reports performance competitive with Claude Sonnet 4.6 at roughly 1/15th the cost. The release positions formal verification as a scalable alternative to human code review for high-stakes software and mathematics.
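
For readers unfamiliar with Lean 4, the artifact a proof agent produces is a machine-checked proof. The toy example below is only an illustration of that idea: it is not taken from Leanstral, FLTEval, or the Fermat's Last Theorem project, and the theorem name is made up for this sketch.

```lean
-- Toy illustration only: from a proof of `p ∧ q`, build a proof of `q ∧ p`.
-- Lean accepts the theorem only if the term type-checks.
theorem and_swap_example (p q : Prop) (h : p ∧ q) : q ∧ p :=
  ⟨h.2, h.1⟩
```

If the proof term were wrong, for instance `⟨h.1, h.2⟩`, the Lean checker would reject it, which is what makes this kind of verification a candidate substitute for manual review.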

Evaluation and Benchmarking, Open Weights Progress, Inference Economics, AI Safety Research, Agent and Tool Ecosystem, Mistral AI, Claude Sonnet 4, Claude Opus 4.6, Qwen3.5 397B A17B, FLTEval, Leanstral, lean-lsp-mcp, Lean 4, Mistral Vibe, Kimi K2.5, Model Context Protocol

Related guides (4)

Claude Opus 4.6

Claude Opus 4.6: Anthropic's Milestone Model for Long-Context and Agentic Work

Model Context Protocol (Concept)

Model Context Protocol (MCP): The Universal Plug for AI Agents

Mistral AI

Mistral AI: Europe's Open-Weight Frontier Lab

AI Safety Research (Topic guide)

AI Safety Research: From Lab Evals to Geopolitical Flashpoint

Related events (8)

7 · Mistral AI News · 19d ago

Mistral AI Releases Devstral: Apache 2.0 Agentic Coding Model with SWE-Bench SOTA

Mistral AI, in collaboration with All Hands AI, releases Devstral, an agentic LLM specialized for software engineering tasks under the Apache 2.0 license. The model achieves 46.8% on SWE-Bench Verified, surpassing the prior open-source state of the art by over 6 percentage points and outperforming larger models like DeepSeek-V3-0324 (671B) and Qwen3 235B-A22B under the same OpenHands scaffold. Devstral is small enough to run on a single RTX 4090 or a Mac with 32GB RAM, and is available via Mistral's API at $0.1/M input tokens, as well as on HuggingFace, Ollama, and other platforms. Mistral indicates a larger agentic coding model is in development.

7 · Mistral AI News · 19d ago

Mistral AI Founding Manifesto and Mistral 7B Release

Mistral AI published its founding mission statement alongside the release of Mistral 7B, a 7-billion-parameter open-weights language model released under Apache 2.0. Mistral claims the model outperforms all available open models up to 13B parameters on standard English and code benchmarks, and says it was produced in three months from a standing start. The post articulates Mistral's strategic thesis: open-weight models will outcompete proprietary black-box APIs for most enterprise use cases, drawing analogies to Linux, WebKit, and Kubernetes. The company signals intent to release progressively larger frontier models while building a commercial offering around on-premise and VPC deployment.

7 · Mistral AI News · 19d ago

Mistral AI Releases Codestral: 22B Open-Weight Code Generation Model

Mistral AI has released Codestral, a 22B open-weight model explicitly designed for code generation, supporting 80+ programming languages with a 32k context window. The model is available under a non-production license on HuggingFace, with commercial licenses available on request, and is accessible via a dedicated API endpoint (codestral.mistral.ai) free during an 8-week beta. Codestral claims state-of-the-art performance on RepoBench, HumanEval, and fill-in-the-middle benchmarks, outperforming DeepSeek Coder 33B and matching or exceeding GPT-4-Turbo on some language-specific evals. Integrations are available with LlamaIndex, LangChain, Continue.dev, and Tabnine for IDE-based developer workflows.

8 · Mistral AI News · 1mo ago

Mistral Small 4: Unified Multimodal, Reasoning, and Coding MoE Model Released Under Apache 2.0

Mistral AI has released Mistral Small 4, a 119B-parameter Mixture-of-Experts model (6B active per token) that unifies capabilities previously split across Magistral (reasoning), Pixtral (multimodal), and Devstral (coding agents) into a single open-weights model. The model features a 256k context window, configurable reasoning effort via a `reasoning_effort` parameter, native text and image input support, and is released under Apache 2.0. Mistral claims 40% latency reduction and 3x throughput improvement over Mistral Small 3, with benchmark results showing competitive performance against GPT-OSS 120B and Qwen models while producing significantly shorter outputs. The release includes day-0 availability as an NVIDIA NIM and support across vLLM, llama.cpp, SGLang, and Transformers.

7 · Mistral AI News · 19d ago

Mistral Small 3: 24B Latency-Optimized Open-Weight Model Released Under Apache 2.0

Mistral AI has released Mistral Small 3, a 24B-parameter instruction-tuned model optimized for low latency, achieving over 81% on MMLU at 150 tokens/s on a single GPU. The model is competitive with Llama 3.3 70B and Qwen 32B while being more than 3x faster on equivalent hardware, and is released under Apache 2.0 for both pretrained and instruction-tuned checkpoints. It is explicitly not trained with RL or synthetic data, positioning it as a base model for community fine-tuning and reasoning capability development. Deployment targets include local inference on consumer hardware (RTX 4090, MacBook 32GB RAM), agentic function calling, and domain-specific fine-tuning.

8 · arXiv · cs.AI · 29d ago

Large-Scale Evaluation of LLM-Driven Formal Proof Search on Open Mathematical Problems

Researchers present the first large-scale evaluation of LLM-based formal proof search on genuinely open mathematical problems, using Lean as a verification backend. Their most capable agent autonomously resolved 9 of 353 open Erdős problems and proved 44 of 492 OEIS conjectures, at a cost of a few hundred dollars per problem. The system is already being deployed in active research across combinatorics, optimization, graph theory, algebraic geometry, and quantum optics. The study also compares agent architectures, finding that more sophisticated designs outperform simple generate-and-verify loops on the hardest problems.

7 · Mistral AI News · 19d ago

Mistral AI Launches Mistral Code: Enterprise AI Coding Assistant with On-Prem Deployment

Mistral AI has announced Mistral Code, an enterprise-grade AI coding assistant currently in private beta for JetBrains IDEs and VS Code. The product bundles four specialized models (Codestral, Codestral Embed, Devstral, Mistral Medium) with an IDE plugin, admin controls, and deployment options ranging from serverless to air-gapped on-premises GPUs. It is built on a fork of the open-source Continue project, with enterprise additions including RBAC, audit logging, and fine-tuning on private repositories. Early enterprise adopters include Abanca, SNCF (4,000 developers), and Capgemini (1,500+ developers).

8 · Mistral AI News · 1mo ago

Mistral Releases Devstral 2 (123B) and Devstral Small 2 (24B) Coding Models Plus Vibe CLI Agent

Mistral AI has released Devstral 2, a 123B-parameter open-weight coding model scoring 72.2% on SWE-bench Verified, and Devstral Small 2, a 24B model scoring 68.0% on the same benchmark and deployable on consumer hardware. Both models support a 256K context window and are permissively licensed (modified MIT and Apache 2.0 respectively). Mistral also launched Vibe CLI, an open-source terminal-based coding agent powered by Devstral that supports multi-file orchestration, natural language code editing, and IDE integration via Agent Communication Protocol. Devstral 2 is currently free via API; once the free period ends, pricing is $0.40 per million input tokens and $2.00 per million output tokens.
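
To make the quoted Devstral 2 prices concrete, here is a minimal sketch of the arithmetic. The two prices come from the summary above; the function name and the example workload (3M input tokens, 0.5M output tokens) are hypothetical and chosen only for illustration.

```python
# Illustrative cost estimate using the per-million-token prices quoted above.
INPUT_PRICE_PER_M = 0.40   # USD per million input tokens (from the summary)
OUTPUT_PRICE_PER_M = 2.00  # USD per million output tokens (from the summary)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    input_cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 3M input tokens and 0.5M output tokens.
    print(f"${estimate_cost_usd(3_000_000, 500_000):.2f}")  # prints $2.20
```

Under these assumptions, input and output contribute $1.20 and $1.00 respectively, so output tokens can dominate the bill even when they are a small share of total volume.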
