7 · Mistral AI News · 19d ago

Mistral AI Releases Devstral Medium and Devstral Small 1.1 for Agentic Coding

Mistral AI, in collaboration with All Hands AI, has released two new agentic coding models: Devstral Small 1.1 (24B parameters, Apache 2.0, 53.6% on SWE-Bench Verified) and Devstral Medium (61.6% on SWE-Bench Verified, API-only). Devstral Medium is positioned as a cost-performance leader: Mistral claims it surpasses Gemini 2.5 Pro and GPT-4.1 at roughly one-quarter of the price, with API pricing of $0.4/M input and $2/M output tokens. Mistral says Devstral Small 1.1 sets a new state of the art among open models for code agents without test-time scaling. It supports both Mistral function calling and XML formats, which keeps it compatible with a broad range of agentic scaffolds.
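As a rough illustration of what the quoted Devstral Medium pricing means in practice, the sketch below estimates the cost of a single workload from its token counts. Only the two prices come from the summary above; the function name `estimate_cost_usd` and the example token counts are hypothetical, chosen purely for illustration.

```python
# Prices quoted in the summary above, in USD per million tokens.
INPUT_PRICE_PER_M = 0.40
OUTPUT_PRICE_PER_M = 2.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate API cost in USD from input and output token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical agentic session: 2M tokens read, 150k tokens generated.
session_cost = estimate_cost_usd(2_000_000, 150_000)
print(f"Estimated session cost: ${session_cost:.2f}")
```

Agentic coding sessions tend to read far more tokens than they write, so the input price often dominates the bill even though the output price is five times higher per token.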

Frontier Model Releases Evaluation and Benchmarking Open Weights Progress Inference Economics Enterprise Deployment Patterns Agent and Tool Ecosystem Devstral 2 Small Mistral AI All Hands AI SWE-Bench Verified Devstral Medium Mistral Code Gemini-2.5-Pro GPT-4.1 OpenHands

Related guides (5)

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action


Mistral AI

Mistral AI: Europe's Open-Weight Frontier Lab


Open Weights Progress · Topic guide

Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier


Enterprise Deployment Patterns · Topic guide

Enterprise Deployment Patterns: From LLM Demo to Production Reality


Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How the Infrastructure Layer Around LLMs Is Consolidating


Related events (8)

7 · Mistral AI News · 19d ago

Mistral AI Releases Devstral: Apache 2.0 Agentic Coding Model with SWE-Bench SOTA

Mistral AI, in collaboration with All Hands AI, releases Devstral, an agentic LLM specialized for software engineering tasks under the Apache 2.0 license. The model achieves 46.8% on SWE-Bench Verified, surpassing the prior open-source state of the art by over 6 percentage points and outperforming larger models like DeepSeek-V3-0324 (671B) and Qwen3 235B-A22B under the same OpenHands scaffold. Devstral is small enough to run on a single RTX 4090 or a Mac with 32GB RAM, and is available via Mistral's API at $0.1/M input tokens, as well as on HuggingFace, Ollama, and other platforms. Mistral indicates a larger agentic coding model is in development.

Frontier Model Releases Evaluation and Benchmarking DeepSeek-V3-0324 Mistral AI GPT-4.1 mini +10 more

8 · Mistral AI News · 1mo ago

Mistral Releases Devstral 2 (123B) and Devstral Small 2 (24B) Coding Models Plus Vibe CLI Agent

Mistral AI has released Devstral 2, a 123B-parameter open-weight coding model scoring 72.2% on SWE-bench Verified, and Devstral Small 2, a 24B model scoring 68.0% on the same benchmark and deployable on consumer hardware. Both models support a 256K context window and are permissively licensed (modified MIT and Apache 2.0 respectively). Mistral also launched Vibe CLI, an open-source terminal-based coding agent powered by Devstral that supports multi-file orchestration, natural language code editing, and IDE integration via Agent Communication Protocol. Devstral 2 is currently free via API; once the free period ends, pricing will be $0.40/$2.00 per million input/output tokens.

Long Context Evolution Frontier Model Releases Devstral 2 Small Mistral AI Kimi K2 +13 more

8 · Mistral AI News · 1mo ago

Mistral Small 4: Unified Multimodal, Reasoning, and Coding MoE Model Released Under Apache 2.0

Mistral AI has released Mistral Small 4, a 119B-parameter Mixture-of-Experts model (6B active per token) that unifies capabilities previously split across Magistral (reasoning), Pixtral (multimodal), and Devstral (coding agents) into a single open-weights model. The model features a 256k context window, configurable reasoning effort via a `reasoning_effort` parameter, native text and image input support, and is released under Apache 2.0. Mistral claims 40% latency reduction and 3x throughput improvement over Mistral Small 3, with benchmark results showing competitive performance against GPT-OSS 120B and Qwen models while producing significantly shorter outputs. The release includes day-0 availability as an NVIDIA NIM and support across vLLM, llama.cpp, SGLang, and Transformers.

Long Context Evolution Frontier Model Releases Mistral AI Mistral Small 4 Pixtral +14 more

7 · Mistral AI News · 19d ago

Codestral 25.01: Mistral AI Releases Updated Coding Model with 2x Speed and Improved FIM Performance

Mistral AI has released Codestral 25.01, a significant upgrade to its Codestral coding model featuring a more efficient architecture and improved tokenizer that generates code approximately 2x faster than its predecessor. The model claims state-of-the-art performance for fill-in-the-middle (FIM) tasks across sub-100B parameter models, with a 256k context window and support for 80+ programming languages. Benchmarks show improvements over Codestral 2405 and competitive or superior results against DeepSeek Coder V2 lite and DeepSeek Coder 33B on HumanEval and FIM metrics. The model is available via Mistral's API, IDE plugins (VS Code, JetBrains via Continue), and for on-premises/VPC deployment, with cloud availability on Vertex AI and Azure AI Foundry.

Frontier Model Releases Evaluation and Benchmarking Mistral AI HumanEvalFIM Azure Foundry +12 more

8 · Mistral AI News · 19d ago

Mistral AI Releases Mistral Large, Claims Second-Best API Model After GPT-4

Mistral AI has released Mistral Large, its most capable model to date, claiming second place among API-accessible models behind GPT-4 on standard benchmarks including MMLU, HellaSwag, and coding/math evals. The model features a 32K context window, native fluency in five European languages, function calling, and constrained output mode. Simultaneously, Mistral is launching a new Mistral Small optimized for latency, restructuring its endpoint lineup, and announcing Microsoft Azure as its first major distribution partner. This marks Mistral's first significant commercial partnership and expansion beyond its own infrastructure.

Long Context Evolution Frontier Model Releases Azure AI Studio Mistral AI Llama 2 70B +13 more

8 · Mistral AI News · 1mo ago

Mistral Launches Medium 3.5 (128B Open Weights), Remote Cloud Coding Agents in Vibe, and Work Mode in Le Chat

Mistral AI has released Mistral Medium 3.5, a 128B dense open-weights model with a 256k context window, configurable reasoning effort, and a vision encoder trained from scratch, scoring 77.6% on SWE-Bench Verified. Alongside the model, Mistral is launching remote cloud-based coding agents in its Vibe CLI and Le Chat interface, enabling async parallel coding sessions that run independently and notify users on completion. A new Work mode in Le Chat provides a multi-step agentic interface for cross-tool workflows including email, calendar, research, and issue tracking. Mistral Medium 3.5 replaces Devstral 2 as the default model in both Le Chat and the Vibe CLI, and is available for self-hosting on as few as four GPUs under a modified MIT license.

Long Context Evolution Frontier Model Releases Mistral AI Qwen3.5 397B A17B Devstral 2 +10 more

7 · Mistral AI News · 19d ago

Mistral Medium 3: Frontier-Class Performance at 8x Lower Cost

Mistral AI has released Mistral Medium 3, a new enterprise-focused language model priced at $0.4/$2 per million input/output tokens. The model claims to achieve 90%+ of Claude Sonnet 3.7's benchmark performance while undercutting cost leaders like DeepSeek v3, and outperforming open models including Llama 4 Maverick. It supports hybrid, on-premises, and in-VPC deployment on as few as four GPUs, and is available immediately on Mistral La Plateforme and Amazon SageMaker, with additional cloud platforms coming soon. The announcement also teases an upcoming large open-weights model release.

Frontier Model Releases Open Weights Progress Mistral AI Amazon SageMaker DeepSeek V4 +11 more

6 · Mistral AI News · 1mo ago

Mistral Vibe 2.0: Terminal-Native Coding Agent with Custom Subagents and Devstral 2

Mistral AI has released Mistral Vibe 2.0, a major upgrade to its terminal-native coding agent, powered by the new Devstral 2 model family. Key additions include custom subagents for specialized tasks, multi-choice clarifications before execution, slash-command skills for preconfigured workflows, and unified agent modes. The product is available on Le Chat Pro and Team plans with pay-as-you-go credits, while Devstral 2 moves to paid API access at $0.40/M input and $2.00/M output tokens. Enterprise add-ons include fine-tuning on internal DSLs, reinforcement learning with custom environments, and end-to-end codebase modernization.

Inference Economics Enterprise Deployment Patterns Devstral 2 Small Mistral AI Devstral 2 +4 more