Mistral AI News · 19d ago

Mistral AI Releases Voxtral: Open-Weight Speech Understanding Models in 24B and 3B Sizes

Mistral AI has released Voxtral, a family of two open-weight speech understanding models, Voxtral Small (24B) and Voxtral Mini (3B), under the Apache 2.0 license. Built on the Mistral Small 3.1 language model backbone, both models handle long-form audio (up to 30 minutes for transcription and 40 minutes for understanding) and offer native multilingual transcription, built-in Q&A and summarization, and function calling directly from voice. According to Mistral's published benchmarks, Voxtral outperforms Whisper large-v3 across all tasks and is competitive with GPT-4o mini and Gemini 2.5 Flash on audio understanding. API pricing starts at $0.001 per minute. The models are available on Hugging Face and through Mistral's API, which also offers a transcription-optimized variant, Voxtral Mini Transcribe.
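To put the quoted starting price in perspective, here is a minimal sketch of the cost arithmetic. It assumes a flat $0.001 per minute with no minimums, rounding, or volume tiers (none of which the summary specifies); the function and constant names are illustrative, not part of any Mistral SDK.

```python
# Starting API price quoted in the summary (assumed flat, per minute of audio).
PRICE_PER_MINUTE_USD = 0.001


def transcription_cost_usd(audio_minutes: float,
                           price_per_minute: float = PRICE_PER_MINUTE_USD) -> float:
    """Estimate API cost for a given amount of audio at a flat per-minute rate."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return audio_minutes * price_per_minute


if __name__ == "__main__":
    # One 40-minute recording, and a 1,000-hour archive (60,000 minutes).
    print(f"40 minutes:  ${transcription_cost_usd(40):.2f}")
    print(f"1,000 hours: ${transcription_cost_usd(1000 * 60):.2f}")
```

At this rate a single 40-minute recording costs about four cents, and a 1,000-hour archive about $60, before any pricing details the summary does not cover.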


Related events (8)

Mistral AI News · 1mo ago

Mistral Releases Voxtral Transcribe 2: State-of-the-Art Speech-to-Text with Sub-200ms Realtime Model

Mistral AI has released Voxtral Transcribe 2, a family of two speech-to-text models: Voxtral Mini Transcribe V2 for batch transcription and Voxtral Realtime for live applications. Voxtral Realtime features a novel streaming architecture with configurable latency down to sub-200ms, a 4B parameter footprint suitable for edge deployment, and is released as open weights under Apache 2.0. Voxtral Mini Transcribe V2 claims state-of-the-art word error rate on FLEURS at $0.003/min, outperforming GPT-4o mini Transcribe, Gemini 2.5 Flash, AssemblyAI, and Deepgram Nova on accuracy benchmarks. Both models support 13 languages with speaker diarization, word-level timestamps, and context biasing.


Mistral AI News · 1mo ago

Mistral Releases Voxtral TTS: 4B-Parameter Multilingual Text-to-Speech Model

Mistral AI has launched Voxtral TTS, its first text-to-speech model, built on a 4B-parameter transformer-based autoregressive flow-matching architecture derived from Ministral 3B. The model supports 9 languages with zero-shot voice adaptation from as little as 3 seconds of reference audio, achieving 70ms latency for typical inputs and a real-time factor of ~9.7x. Mistral reports that in human evaluations the model was rated more natural than ElevenLabs Flash v2.5 and on par with ElevenLabs v3. The model is available via Mistral Studio and the API, targeting enterprise voice agent workflows.


Mistral AI News · 19d ago

Mistral Small 3.1: Multimodal, 128k Context, Apache 2.0 Open-Weight Model

Mistral AI has released Mistral Small 3.1, a ~24B-parameter model with multimodal understanding and a 128k-token context window. Mistral claims best-in-class performance among small models, outperforming Gemma 3 and GPT-4o Mini on text, multimodal, and multilingual benchmarks. The model runs on a single RTX 4090 or a Mac with 32GB of RAM at 150 tokens/second, and is released under the Apache 2.0 license with both base and instruct checkpoints. It is available on Hugging Face, Mistral's La Plateforme API, and Google Cloud Vertex AI, with NVIDIA NIM and Azure AI Foundry support coming soon. The release targets enterprise and on-device use cases including document verification, agentic workflows, and domain fine-tuning.


Mistral AI News · 19d ago

Mistral Medium 3: Frontier-Class Performance at 8x Lower Cost

Mistral AI has released Mistral Medium 3, a new enterprise-focused language model priced at $0.40 per million input tokens and $2 per million output tokens. Mistral claims the model reaches 90% or more of Claude Sonnet 3.7's benchmark performance while undercutting cost leaders such as DeepSeek v3 and outperforming open models including Llama 4 Maverick. It supports hybrid, on-premises, and in-VPC deployment on as few as four GPUs, and is available immediately on Mistral La Plateforme and Amazon SageMaker, with additional cloud platforms coming soon. The announcement also teases an upcoming large open-weights model release.
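As a rough illustration of what the quoted token prices mean for a workload, the sketch below computes cost from input and output token counts. It assumes the two per-million-token rates from the summary apply linearly with no caching discounts or tiers; the names are illustrative and not taken from any Mistral client library.

```python
# Per-million-token prices quoted in the summary (assumed to apply linearly).
INPUT_USD_PER_MILLION = 0.40
OUTPUT_USD_PER_MILLION = 2.00


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost for a workload given its input and output token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
            + output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION)


if __name__ == "__main__":
    # Example workload: 10M input tokens and 2M output tokens.
    print(f"${request_cost_usd(10_000_000, 2_000_000):.2f}")
```

Because output tokens cost five times as much as input tokens at these rates, the input/output mix of a workload matters as much as its total volume.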


Mistral AI News · 1mo ago

Mistral Small 4: Unified Multimodal, Reasoning, and Coding MoE Model Released Under Apache 2.0

Mistral AI has released Mistral Small 4, a 119B-parameter Mixture-of-Experts model (6B active per token) that unifies capabilities previously split across Magistral (reasoning), Pixtral (multimodal), and Devstral (coding agents) into a single open-weights model. The model features a 256k context window, configurable reasoning effort via a `reasoning_effort` parameter, native text and image input support, and is released under Apache 2.0. Mistral claims 40% latency reduction and 3x throughput improvement over Mistral Small 3, with benchmark results showing competitive performance against GPT-OSS 120B and Qwen models while producing significantly shorter outputs. The release includes day-0 availability as an NVIDIA NIM and support across vLLM, llama.cpp, SGLang, and Transformers.
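The Mixture-of-Experts figures above imply that only a small share of the weights is used for any one token. A minimal sketch of that arithmetic, using only the 119B total and 6B active counts from the summary (the helper name is illustrative):

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Share of a sparse model's parameters that participate in each token."""
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active_params_b / total_params_b


if __name__ == "__main__":
    share = active_fraction(6, 119)  # billions of parameters, from the summary
    print(f"{share:.1%} of parameters active per token")
```

This works out to roughly 5% of parameters per token. Per-token compute scales with the active count, while memory requirements still scale with the full 119B, since all experts must be resident.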


Mistral AI News · 19d ago

Mistral Large 2 (123B): New Frontier Model with 128k Context, Multilingual and Code Capabilities

Mistral AI has released Mistral Large 2, a 123-billion-parameter model with a 128k context window, supporting 80+ programming languages and over a dozen natural languages. Mistral claims competitive performance with GPT-4o, Claude 3 Opus, and Llama 3 405B on code generation, reasoning, and multilingual benchmarks, while targeting cost-efficient single-node inference. Weights are available under the Mistral Research License for non-commercial use, with a commercial license required for self-deployment. The model is accessible via Mistral's La Plateforme API (mistral-large-2407), Hugging Face, and Google Cloud Vertex AI.


Mistral AI News · 19d ago

Mistral Small 3: 24B Latency-Optimized Open-Weight Model Released Under Apache 2.0

Mistral AI has released Mistral Small 3, a 24B-parameter instruction-tuned model optimized for low latency, achieving over 81% on MMLU at 150 tokens/s on a single GPU. The model is competitive with Llama 3.3 70B and Qwen 32B while being more than 3x faster on equivalent hardware, and is released under Apache 2.0 for both pretrained and instruction-tuned checkpoints. It is explicitly not trained with RL or synthetic data, positioning it as a base model for community fine-tuning and reasoning capability development. Deployment targets include local inference on consumer hardware (RTX 4090, MacBook 32GB RAM), agentic function calling, and domain-specific fine-tuning.


Mistral AI News · 1mo ago

Pixtral Large: Mistral AI's 124B Open-Weights Multimodal Model

Mistral AI has released Pixtral Large, a 124B open-weights multimodal model built on Mistral Large 2, featuring a 1B-parameter vision encoder and a 128K context window that fits at least 30 high-resolution images. Mistral claims state-of-the-art results on MathVista, DocVQA, and ChartQA, outperforming GPT-4o and Gemini-1.5 Pro on several benchmarks, and reports that the model leads the LMSys Vision Leaderboard among open-weights models by ~50 Elo points. At the same time, Mistral updated its text model to Mistral Large 24.11, with improvements in long-context understanding, function calling, and RAG/agentic workflows. Note: the model has since been deprecated and replaced by newer Mistral vision models.
