Almanac
← Events
5Mistral AI News·1mo ago

Pixtral 12B: Mistral AI's First Multimodal Model (Now Deprecated)

Mistral AI released Pixtral 12B in September 2024 as their first natively multimodal model, combining a new 400M parameter vision encoder trained from scratch with a 12B multimodal decoder based on Mistral Nemo. The model supports variable image sizes and aspect ratios, a 128K token context window for multiple images, and achieved 52.5% on MMMU while maintaining strong text-only benchmark performance. The model is now deprecated and has been replaced by newer vision and multimodal models from Mistral. It was released under Apache 2.0 license.

Related guides (4)

Related events (8)

7Mistral Ai News·1mo ago·source ↗

Pixtral Large: Mistral AI's 124B Open-Weights Multimodal Model

Mistral AI released Pixtral Large, a 124B open-weights multimodal model built on Mistral Large 2, featuring a 1B parameter vision encoder and 128K context window supporting at least 30 high-resolution images. The model claims state-of-the-art results on MathVista, DocVQA, and ChartQA, outperforming GPT-4o and Gemini-1.5 Pro on several benchmarks, and leads the LMSys Vision Leaderboard among open-weights models by ~50 ELO points. Simultaneously, Mistral updated its text model to Mistral Large 24.11 with improvements in long-context understanding, function calling, and RAG/agentic workflows. Note: the model has since been deprecated and replaced by newer Mistral vision models.

8Mistral Ai News·1mo ago·source ↗

Mistral Small 4: Unified Multimodal, Reasoning, and Coding MoE Model Released Under Apache 2.0

Mistral AI has released Mistral Small 4, a 119B-parameter Mixture-of-Experts model (6B active per token) that unifies capabilities previously split across Magistral (reasoning), Pixtral (multimodal), and Devstral (coding agents) into a single open-weights model. The model features a 256k context window, configurable reasoning effort via a `reasoning_effort` parameter, native text and image input support, and is released under Apache 2.0. Mistral claims 40% latency reduction and 3x throughput improvement over Mistral Small 3, with benchmark results showing competitive performance against GPT-OSS 120B and Qwen models while producing significantly shorter outputs. The release includes day-0 availability as an NVIDIA NIM and support across vLLM, llama.cpp, SGLang, and Transformers.

7Mistral Ai News·19d ago·source ↗

Mistral Small 3.1: Multimodal, 128k Context, Apache 2.0 Open-Weight Model

Mistral AI releases Mistral Small 3.1, a ~24B parameter model with multimodal understanding, 128k token context window, and claimed best-in-class performance among small models, outperforming Gemma 3 and GPT-4o Mini on text, multimodal, and multilingual benchmarks. The model runs on a single RTX 4090 or 32GB RAM Mac at 150 tokens/second and is released under Apache 2.0 license with both base and instruct checkpoints. It is available on HuggingFace, Mistral's La Plateforme API, and Google Cloud Vertex AI, with NVIDIA NIM and Azure AI Foundry support coming soon. The release targets enterprise and on-device use cases including document verification, agentic workflows, and domain fine-tuning.

7Mistral Ai News·19d ago·source ↗

Mistral AI Releases Mistral Small v24.09, Free API Tier, and Pixtral 12B Vision on le Chat with Broad Price Cuts

Mistral AI announced a multi-part release on September 17, 2024: a free tier for la Plateforme API, significant price reductions across its model family (up to 80% for Mistral Small and Codestral), an updated Mistral Small v24.09 (22B parameters, improved alignment and reasoning), and the availability of Pixtral 12B vision capabilities on le Chat. Pixtral 12B, released under Apache 2.0, supports images of any size without text performance degradation and is now accessible for free on le Chat. The pricing updates also apply to cloud partner deployments on Azure AI Studio, Amazon Bedrock, and Google Vertex AI.

7Mistral Ai News·19d ago·source ↗

Mistral NeMo: 12B Open-Weights Model with 128k Context, Built with NVIDIA

Mistral AI and NVIDIA jointly release Mistral NeMo, a 12B parameter model under Apache 2.0 license featuring a 128k token context window and a new tokenizer called Tekken based on Tiktoken. The model is designed as a drop-in replacement for Mistral 7B, supports multilingual applications across 11+ languages, and was trained with quantization awareness enabling FP8 inference without performance loss. Benchmark comparisons show competitive performance against Gemma 2 9B and Llama 3 8B. Weights are available on HuggingFace and the model is also packaged as an NVIDIA NIM inference microservice.

8Mistral Ai News·19d ago·source ↗

Mistral Large 2 (123B): New Frontier Model with 128k Context, Multilingual and Code Capabilities

Mistral AI releases Mistral Large 2, a 123-billion-parameter model with a 128k context window, supporting 80+ coding languages and over a dozen natural languages. The model claims competitive performance with GPT-4o, Claude 3 Opus, and Llama 3 405B on code generation, reasoning, and multilingual benchmarks, while targeting cost-efficient single-node inference. Weights are available under a Mistral Research License for non-commercial use, with a commercial license required for self-deployment. The model is accessible via Mistral's la Plateforme API (mistral-large-2407), HuggingFace, and Google Cloud Vertex AI.

7Mistral Ai News·19d ago·source ↗

Mistral Small 3: 24B Latency-Optimized Open-Weight Model Released Under Apache 2.0

Mistral AI has released Mistral Small 3, a 24B-parameter instruction-tuned model optimized for low latency, achieving over 81% on MMLU at 150 tokens/s on a single GPU. The model is competitive with Llama 3.3 70B and Qwen 32B while being more than 3x faster on equivalent hardware, and is released under Apache 2.0 for both pretrained and instruction-tuned checkpoints. It is explicitly not trained with RL or synthetic data, positioning it as a base model for community fine-tuning and reasoning capability development. Deployment targets include local inference on consumer hardware (RTX 4090, MacBook 32GB RAM), agentic function calling, and domain-specific fine-tuning.

9Mistral Ai News·19d ago·source ↗

Mixtral 8x7B: Mistral AI Releases Sparse Mixture-of-Experts Open-Weight Model

Mistral AI has released Mixtral 8x7B, a sparse mixture-of-experts (SMoE) model with 46.7B total parameters but only 12.9B active parameters per token, enabling inference speed and cost equivalent to a 12.9B model. Licensed under Apache 2.0, Mixtral outperforms Llama 2 70B on most benchmarks and matches or exceeds GPT-3.5, with support for 32k context, five European languages, and strong code generation. An instruction-tuned variant (Mixtral 8x7B Instruct) achieves 8.3 on MT-Bench, claimed best among open-source models at release. The model is deployed behind Mistral's mistral-small API endpoint and supported via vLLM with Megablocks CUDA kernels.