Mistral AI News · 19d ago

Mistral AI Releases Devstral: Apache 2.0 Agentic Coding Model with SWE-Bench SOTA

Mistral AI, in collaboration with All Hands AI, releases Devstral, an agentic LLM specialized for software engineering tasks under the Apache 2.0 license. The model achieves 46.8% on SWE-Bench Verified, surpassing the prior open-source state of the art by over 6 percentage points and outperforming larger models such as DeepSeek-V3-0324 (671B) and Qwen3 235B-A22B under the same OpenHands scaffold. Devstral is small enough to run on a single RTX 4090 or a Mac with 32GB RAM, and is available via Mistral's API at $0.1/M input tokens, as well as on HuggingFace, Ollama, and other platforms. Mistral indicates a larger agentic coding model is in development.



Related events (8)

Mistral AI News · 19d ago

Mistral AI Releases Devstral Medium and Devstral Small 1.1 for Agentic Coding

Mistral AI, in collaboration with All Hands AI, has released two new agentic coding models: Devstral Small 1.1 (24B parameters, Apache 2.0, 53.6% on SWE-Bench Verified) and Devstral Medium (61.6% on SWE-Bench Verified, API-only). Devstral Medium is positioned as a cost-performance leader: priced at $0.4/M input and $2/M output tokens, it is claimed to surpass Gemini 2.5 Pro and GPT-4.1 at roughly one-quarter the price. Devstral Small 1.1 sets a new state of the art among open models for code agents without test-time scaling, and supports both Mistral function calling and XML formats for broad agentic scaffold compatibility.
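To make the quoted per-token pricing concrete, here is a minimal sketch of the arithmetic, using only the $0.4/M input and $2/M output figures from the summary above. The function name and the example token counts are hypothetical, chosen for illustration rather than taken from Mistral.

```python
# Prices quoted in the summary above, in USD per million tokens.
INPUT_PRICE_PER_M = 0.40
OUTPUT_PRICE_PER_M = 2.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost in USD for a given number of input and output tokens."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical agentic session: 2M tokens read, 100k tokens generated.
session_cost = estimate_cost_usd(2_000_000, 100_000)
print(f"${session_cost:.2f}")
```

In agentic coding workloads the input side (repository context, tool output) often dominates token counts, so the input price may matter more than the headline output price.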


Mistral AI News · 1mo ago

Mistral Releases Devstral 2 (123B) and Devstral Small 2 (24B) Coding Models Plus Vibe CLI Agent

Mistral AI has released Devstral 2, a 123B-parameter open-weight coding model scoring 72.2% on SWE-bench Verified, and Devstral Small 2, a 24B model scoring 68.0% on the same benchmark and deployable on consumer hardware. Both models support a 256K context window and are permissively licensed (modified MIT and Apache 2.0 respectively). Mistral also launched Vibe CLI, an open-source terminal-based coding agent powered by Devstral that supports multi-file orchestration, natural language code editing, and IDE integration via Agent Communication Protocol. Devstral 2 is currently free via API; once the free period ends, pricing will be $0.40 per million input tokens and $2.00 per million output tokens.


Mistral AI News · 1mo ago

Mistral Small 4: Unified Multimodal, Reasoning, and Coding MoE Model Released Under Apache 2.0

Mistral AI has released Mistral Small 4, a 119B-parameter Mixture-of-Experts model (6B active per token) that unifies capabilities previously split across Magistral (reasoning), Pixtral (multimodal), and Devstral (coding agents) into a single open-weights model. The model features a 256k context window, configurable reasoning effort via a `reasoning_effort` parameter, native text and image input support, and is released under Apache 2.0. Mistral claims 40% latency reduction and 3x throughput improvement over Mistral Small 3, with benchmark results showing competitive performance against GPT-OSS 120B and Qwen models while producing significantly shorter outputs. The release includes day-0 availability as an NVIDIA NIM and support across vLLM, llama.cpp, SGLang, and Transformers.


Mistral AI News · 19d ago

Mistral Small 3: 24B Latency-Optimized Open-Weight Model Released Under Apache 2.0

Mistral AI has released Mistral Small 3, a 24B-parameter instruction-tuned model optimized for low latency, achieving over 81% on MMLU at 150 tokens/s on a single GPU. The model is competitive with Llama 3.3 70B and Qwen 32B while being more than 3x faster on equivalent hardware, and is released under Apache 2.0 for both pretrained and instruction-tuned checkpoints. It is explicitly not trained with RL or synthetic data, positioning it as a base model for community fine-tuning and reasoning capability development. Deployment targets include local inference on consumer hardware (RTX 4090, MacBook 32GB RAM), agentic function calling, and domain-specific fine-tuning.


Mistral AI News · 1mo ago

Mistral Launches Medium 3.5 (128B Open Weights), Remote Cloud Coding Agents in Vibe, and Work Mode in Le Chat

Mistral AI has released Mistral Medium 3.5, a 128B dense open-weights model with a 256k context window, configurable reasoning effort, and a vision encoder trained from scratch, scoring 77.6% on SWE-Bench Verified. Alongside the model, Mistral is launching remote cloud-based coding agents in its Vibe CLI and Le Chat interface, enabling async parallel coding sessions that run independently and notify users on completion. A new Work mode in Le Chat provides a multi-step agentic interface for cross-tool workflows including email, calendar, research, and issue tracking. Mistral Medium 3.5 replaces Devstral 2 as the default model in both Le Chat and the Vibe CLI, and is available for self-hosting on as few as four GPUs under a modified MIT license.


Mistral AI News · 19d ago

Codestral 25.01: Mistral AI Releases Updated Coding Model with 2x Speed and Improved FIM Performance

Mistral AI has released Codestral 25.01, a significant upgrade to its Codestral coding model featuring a more efficient architecture and improved tokenizer that generates code approximately 2x faster than its predecessor. The model claims state-of-the-art performance for fill-in-the-middle (FIM) tasks across sub-100B parameter models, with a 256k context window and support for 80+ programming languages. Benchmarks show improvements over Codestral 2405 and competitive or superior results against DeepSeek Coder V2 lite and DeepSeek Coder 33B on HumanEval and FIM metrics. The model is available via Mistral's API, IDE plugins (VS Code, JetBrains via Continue), and for on-premises/VPC deployment, with cloud availability on Vertex AI and Azure AI Foundry.


Mistral AI News · 19d ago

Mistral AI Releases Codestral: 22B Open-Weight Code Generation Model

Mistral AI has released Codestral, a 22B open-weight model explicitly designed for code generation, supporting 80+ programming languages with a 32k context window. The model is available under a non-production license on HuggingFace, with commercial licenses available on request, and is accessible via a dedicated API endpoint (codestral.mistral.ai) free during an 8-week beta. Codestral claims state-of-the-art performance on RepoBench, HumanEval, and fill-in-the-middle benchmarks, outperforming DeepSeek Coder 33B and matching or exceeding GPT-4-Turbo on some language-specific evals. Integrations are available with LlamaIndex, LangChain, Continue.dev, and Tabnine for IDE-based developer workflows.


Mistral AI News · 19d ago

Mistral Small 3.1: Multimodal, 128k Context, Apache 2.0 Open-Weight Model

Mistral AI releases Mistral Small 3.1, a ~24B parameter model with multimodal understanding, 128k token context window, and claimed best-in-class performance among small models, outperforming Gemma 3 and GPT-4o Mini on text, multimodal, and multilingual benchmarks. The model runs on a single RTX 4090 or 32GB RAM Mac at 150 tokens/second and is released under Apache 2.0 license with both base and instruct checkpoints. It is available on HuggingFace, Mistral's La Plateforme API, and Google Cloud Vertex AI, with NVIDIA NIM and Azure AI Foundry support coming soon. The release targets enterprise and on-device use cases including document verification, agentic workflows, and domain fine-tuning.
