Mistral Releases Voxtral TTS: 4B-Parameter Multilingual Text-to-Speech Model
Mistral AI has launched Voxtral TTS, its first text-to-speech model, built on a 4B-parameter transformer-based autoregressive flow-matching architecture derived from Ministral 3B. The model supports 9 languages with zero-shot voice adaptation from as little as 3 seconds of reference audio, achieving 70ms latency for typical inputs and a real-time factor of ~9.7x. Mistral reports that in human evaluations its output was rated more natural than ElevenLabs Flash v2.5 and on par with ElevenLabs v3. The model is available via Mistral Studio and API, targeting enterprise voice agent workflows.
Related guides (5)

Enterprise Deployment Patterns (Topic guide)
Enterprise Deployment Patterns: From LLM Demo to Production Reality
Related events (8)
Mistral AI Releases Voxtral: Open-Weight Speech Understanding Models in 24B and 3B Sizes
Mistral AI has released Voxtral, a family of two open-weight speech understanding models (Voxtral Small at 24B and Voxtral Mini at 3B) under the Apache 2.0 license. Both models support long-form audio up to 30-40 minutes, native multilingual transcription, built-in Q&A and summarization, and function-calling directly from voice, built on the Mistral Small 3.1 language model backbone. Benchmarks show Voxtral outperforms Whisper large-v3 across all tasks and is competitive with GPT-4o mini and Gemini 2.5 Flash on audio understanding, while pricing starts at $0.001/minute via API. Models are available on Hugging Face and through Mistral's API, with a transcription-optimized variant (Voxtral Mini Transcribe) also offered.
Mistral Releases Voxtral Transcribe 2: State-of-the-Art Speech-to-Text with Sub-200ms Realtime Model
Mistral AI has released Voxtral Transcribe 2, a family of two speech-to-text models: Voxtral Mini Transcribe V2 for batch transcription and Voxtral Realtime for live applications. Voxtral Realtime features a novel streaming architecture with configurable latency down to sub-200ms, a 4B parameter footprint suitable for edge deployment, and is released as open weights under Apache 2.0. Voxtral Mini Transcribe V2 claims state-of-the-art word error rate on FLEURS at $0.003/min, outperforming GPT-4o mini Transcribe, Gemini 2.5 Flash, AssemblyAI, and Deepgram Nova on accuracy benchmarks. Both models support 13 languages with speaker diarization, word-level timestamps, and context biasing.
Mistral Small 3.1: Multimodal, 128k Context, Apache 2.0 Open-Weight Model
Mistral AI releases Mistral Small 3.1, a ~24B parameter model with multimodal understanding, a 128k token context window, and claimed best-in-class performance among small models, outperforming Gemma 3 and GPT-4o Mini on text, multimodal, and multilingual benchmarks. The model runs on a single RTX 4090 or a Mac with 32GB of RAM at 150 tokens/second and is released under the Apache 2.0 license with both base and instruct checkpoints. It is available on Hugging Face, Mistral's La Plateforme API, and Google Cloud Vertex AI, with NVIDIA NIM and Azure AI Foundry support coming soon. The release targets enterprise and on-device use cases including document verification, agentic workflows, and domain fine-tuning.
Mistral Small 4: Unified Multimodal, Reasoning, and Coding MoE Model Released Under Apache 2.0
Mistral AI has released Mistral Small 4, a 119B-parameter Mixture-of-Experts model (6B active per token) that unifies capabilities previously split across Magistral (reasoning), Pixtral (multimodal), and Devstral (coding agents) into a single open-weights model. The model features a 256k context window, configurable reasoning effort via a `reasoning_effort` parameter, native text and image input support, and is released under Apache 2.0. Mistral claims 40% latency reduction and 3x throughput improvement over Mistral Small 3, with benchmark results showing competitive performance against GPT-OSS 120B and Qwen models while producing significantly shorter outputs. The release includes day-0 availability as an NVIDIA NIM and support across vLLM, llama.cpp, SGLang, and Transformers.
Mistral AI Releases Mistral Large, Claims Second-Best API Model After GPT-4
Mistral AI has released Mistral Large, its most capable model to date, claiming second place among API-accessible models behind GPT-4 on standard benchmarks including MMLU, HellaSwag, and coding/math evals. The model features a 32K context window, native fluency in five European languages, function calling, and constrained output mode. Simultaneously, Mistral is launching a new Mistral Small optimized for latency, restructuring its endpoint lineup, and announcing Microsoft Azure as its first major distribution partner. This marks Mistral's first significant commercial partnership and expansion beyond its own infrastructure.
Mistral Medium 3: Frontier-Class Performance at 8x Lower Cost
Mistral AI has released Mistral Medium 3, a new enterprise-focused language model priced at $0.4/$2 per million input/output tokens. The model claims to achieve 90%+ of Claude Sonnet 3.7's benchmark performance while undercutting cost leaders like DeepSeek v3, and outperforming open models including Llama 4 Maverick. It supports hybrid, on-premises, and in-VPC deployment on as few as four GPUs, and is available immediately on Mistral La Plateforme and Amazon SageMaker, with additional cloud platforms coming soon. The announcement also teases an upcoming large open-weights model release.
Mistral Large 2 (123B): New Frontier Model with 128k Context, Multilingual and Code Capabilities
Mistral AI releases Mistral Large 2, a 123-billion-parameter model with a 128k context window, supporting 80+ coding languages and over a dozen natural languages. The model claims competitive performance with GPT-4o, Claude 3 Opus, and Llama 3 405B on code generation, reasoning, and multilingual benchmarks, while targeting cost-efficient single-node inference. Weights are available under a Mistral Research License for non-commercial use, with a commercial license required for self-deployment. The model is accessible via Mistral's La Plateforme API (mistral-large-2407), Hugging Face, and Google Cloud Vertex AI.
Mistral AI Founding Manifesto and Mistral 7B Release
Mistral AI published its founding mission statement alongside the release of Mistral 7B, a 7-billion-parameter open-weights language model released under Apache 2.0. Mistral claims the model outperforms all available open models up to 13B parameters on standard English and code benchmarks, and says it was produced in three months from a standing start. The post articulates Mistral's strategic thesis: open-weight models will outcompete proprietary black-box APIs for most enterprise use cases, drawing analogies to Linux, WebKit, and Kubernetes. The company signals intent to release progressively larger frontier models while building a commercial offering around on-premise and VPC deployment.



