Entity · model

Mistral 7B

model · active · mistral-7b-f29b35e3 · 11 events · first seen May 18, 2026

Aliases: Mistral 7B, Mistral-7B

Co-occurring entities

More like this (12)

Mistral 7B Instruct v0.2 · Mistral-7B-v0.3 · Mistral Small 4 · Mistral 3.1 · Mistral AI · Mistral-medium · Mistral · Mistral Large 24.11 · Mistral Large 2 · Mistral Nemo · Mistral Next · mistral-medium-latest

Recent events (11)

4 · arXiv · cs.CL · Jul 9, 2026

PALS: Percentile-aware per-layer sparsity improves LLM pruning on LLaMA-2 but not universally

PALS (Percentile-Aware Layerwise Sparsity) is a one-shot pruning method that assigns per-layer sparsity ratios based on the 99th percentile of activation magnitudes, bounded within ±5% of a target ratio. On LLaMA-2-7B at 50% sparsity, PALS achieves perplexity of 10.96 vs. 12.92 for uniform Wanda, a statistically significant improvement requiring no fine-tuning. However, gains are architecture-dependent: LLaMA-3-8B shows marginal improvement and Mistral-7B shows none. A notable negative finding is that gradient-based allocation performs worse than random, suggesting gradient magnitude is a poor proxy for the impact of discrete weight removal.

Open Weights Progress · Inference Economics · PALS · WikiText-2 · LLaMA-7B · +5 more
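The allocation idea in the PALS summary above can be sketched roughly as follows. This is not the paper's implementation: the summary only states that per-layer ratios depend on the 99th percentile of activation magnitudes and stay within ±5% of the target. The linear mapping (layers with larger percentiles are pruned less), the function names, and the toy activations are all assumptions made for illustration.

```python
import math

def percentile(values, q):
    """Nearest-rank percentile of a non-empty list, q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def allocate_sparsity(layer_activations, target=0.5, band=0.05, q=99):
    """Assumed rule: layers with a larger q-th percentile magnitude get less sparsity.

    Offsets are scaled so every ratio stays inside [target - band, target + band]
    and the ratios average out to the target.
    """
    scores = [percentile([abs(a) for a in acts], q) for acts in layer_activations]
    mean = sum(scores) / len(scores)
    spread = max(abs(s - mean) for s in scores) or 1.0  # avoid dividing by zero
    ratios = []
    for s in scores:
        offset = -band * (s - mean) / spread  # always within [-band, +band]
        ratios.append(min(target + band, max(target - band, target + offset)))
    return ratios

# Toy activations for three hypothetical layers.
toy_layers = [[0.1] * 100, [1.0] * 100, [10.0] * 100]
toy_ratios = allocate_sparsity(toy_layers)
print(toy_ratios)
```

In this toy run the layer with the largest activations receives the lowest sparsity ratio, while the average across layers stays at the 50% target.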

8 · Mistral AI News · Jun 1, 2026

Mistral AI Releases Mixtral 8x22B Under Apache 2.0

Mistral AI has released Mixtral 8x22B, a sparse Mixture-of-Experts model with 141B total parameters but only 39B active parameters, under the permissive Apache 2.0 license. The model features a 64K token context window, native function calling, multilingual support across five European languages, and strong math and coding performance. Mistral claims it outperforms all other open-weight models on standard benchmarks while being faster than dense 70B models due to sparse activation. An instructed version achieves 90.8% on GSM8K maj@8.

Frontier Model Releases · Open Weights Progress · Mistral AI · Llama 2 70B · Apache 2.0 · +10 more
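The sparse-activation claim in the Mixtral 8x22B summary comes down to simple arithmetic, sketched below using only the 39B and 141B figures quoted there. Treating per-token compute as roughly proportional to active parameters is a simplifying assumption, and the function name is illustrative.

```python
def active_fraction(active_b, total_b):
    """Share of parameters used per token, given counts in billions."""
    return active_b / total_b

mixtral_share = active_fraction(39, 141)  # figures from the release summary
vs_dense_70b = 39 / 70                    # active parameters relative to a dense 70B model

print(f"active share of Mixtral 8x22B: {mixtral_share:.1%}")
print(f"active parameters vs dense 70B: {vs_dense_70b:.1%}")
```

Under that assumption, each token touches well under a third of the model's weights, which is consistent with the claim that it runs faster than a dense 70B model.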

7 · Mistral AI News · Jun 1, 2026

Mistral NeMo: 12B Open-Weights Model with 128k Context, Built with NVIDIA

Mistral AI and NVIDIA jointly release Mistral NeMo, a 12B parameter model under Apache 2.0 license featuring a 128k token context window and a new tokenizer called Tekken based on Tiktoken. The model is designed as a drop-in replacement for Mistral 7B, supports multilingual applications across 11+ languages, and was trained with quantization awareness enabling FP8 inference without performance loss. Benchmark comparisons show competitive performance against Gemma 2 9B and Llama 3 8B. Weights are available on HuggingFace and the model is also packaged as an NVIDIA NIM inference microservice.

Long Context Evolution · Frontier Model Releases · Mistral AI · Gemma 2 9B · Apache 2.0 · +9 more

7 · Mistral AI News · Jun 1, 2026

Mistral AI Releases Ministral 3B and 8B Edge Models

Mistral AI has introduced two new small language models, Ministral 3B and Ministral 8B, targeting on-device and edge computing use cases. Both models support up to 128k context length and claim state-of-the-art performance in the sub-10B parameter category, outperforming comparable models from Google and Meta on internal benchmarks. Ministral 8B features an interleaved sliding-window attention mechanism for memory-efficient inference and is priced at $0.1/M tokens via API, while Ministral 3B is priced at $0.04/M tokens. Weights for Ministral 8B Instruct are available for research use, with commercial licensing available on request.

Long Context Evolution · Frontier Model Releases · Mistral AI · Gemma 2 9B · Ministral 8B · +12 more
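The per-token prices quoted in the Ministral summary translate into workload costs as in the sketch below. It assumes a single flat rate per million tokens, since the summary does not distinguish input from output pricing. The dictionary keys are hypothetical labels, not API model identifiers.

```python
# USD per million tokens, as quoted in the summary above.
PRICE_PER_M_TOKENS = {
    "ministral-8b": 0.10,  # hypothetical label, not an API identifier
    "ministral-3b": 0.04,  # hypothetical label, not an API identifier
}

def cost_usd(model, tokens):
    """Estimated cost of processing `tokens` tokens at a flat per-million rate."""
    return PRICE_PER_M_TOKENS[model] * tokens / 1_000_000

workload = 5_000_000  # illustrative token count
for name in PRICE_PER_M_TOKENS:
    print(f"{name}: ${cost_usd(name, workload):.2f}")
```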

6 · Mistral AI News · Jun 1, 2026

Mistral AI Releases Mathstral 7B: Math-Specialized Model with SOTA Reasoning in Size Category

Mistral AI has released Mathstral 7B, a math- and STEM-specialized model built on Mistral 7B and developed in collaboration with Project Numina. In standard evaluation the model achieves 56.6% on MATH and 63.47% on MMLU. Its MATH score rises to 74.59% when a reward model selects the best of 64 candidates, a form of inference-time compute scaling. Weights are open on HuggingFace and compatible with the mistral-inference and mistral-finetune tooling.

Frontier Model Releases · Evaluation and Benchmarking · Mistral AI · Mathstral 7B · Project Numina · +8 more
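The reward-model selection mentioned in the Mathstral summary is a best-of-N scheme: sample several candidate answers, score each one, and keep the highest-scoring. The sketch below shows the selection step only. The candidate strings and scores are made up, and `reward_fn` stands in for a real reward model that the summary does not describe.

```python
def best_of_n(candidates, reward_fn):
    """Return the candidate that the reward function scores highest."""
    return max(candidates, key=reward_fn)

# Toy stand-in for reward-model scores (hypothetical values).
toy_scores = {"x = 3": 0.2, "x = 4": 0.9, "x = 5": 0.4}

best = best_of_n(list(toy_scores), toy_scores.get)
print(best)
```

In the reported setup N is 64, so the extra accuracy is paid for with 64 generations plus 64 reward-model evaluations per problem.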

6 · Mistral AI News · Jun 1, 2026

Mistral AI Launches Model Customization Suite: Open-Source SDK, Managed Fine-Tuning, and Custom Training

Mistral AI has introduced three tiers of model customization on la Plateforme: an open-source LoRA-based fine-tuning SDK (mistral-finetune) for self-hosted use, serverless managed fine-tuning services via API initially supporting Mistral 7B and Mistral Small, and bespoke custom training services including continuous pretraining for enterprise customers. The managed fine-tuning uses LoRA adapters and claims cost and efficiency advantages over full fine-tuning while maintaining comparable performance. This positions Mistral as a full-stack customization provider competing with OpenAI's fine-tuning API and similar offerings.

Open Weights Progress · Inference Economics · Mistral AI · Mistral Small 4 · mistral-finetune · +6 more
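The cost advantage claimed for LoRA adapters over full fine-tuning follows from parameter counts. For a weight matrix of shape d_out × d_in, a rank-r adapter trains two small factors instead of the whole matrix. The sketch below works through that count. The dimension and rank are illustrative values, not settings taken from the announcement or from mistral-finetune.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

def full_params(d_in, d_out):
    """Parameters updated when the full matrix is fine-tuned."""
    return d_in * d_out

d, r = 4096, 16  # illustrative values, not taken from the announcement
ratio = lora_trainable_params(d, d, r) / full_params(d, d)
print(f"LoRA trains {ratio:.2%} of this matrix's parameters")
```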

8 · Mistral AI News · Jun 1, 2026

Mistral 7B: Open-Weights 7B Model Outperforming Llama 2 13B

Mistral AI released Mistral 7B, a 7.3B-parameter language model under the Apache 2.0 license that outperforms Llama 2 13B across all evaluated benchmarks and outperforms Llama 1 34B on many of them. The model employs Grouped-Query Attention (GQA) for faster inference and Sliding Window Attention (SWA) to handle longer sequences at reduced cost, achieving roughly a 2x speed improvement at 16k sequence length. A fine-tuned chat variant, Mistral 7B Instruct, outperforms all 7B chat models on MT-Bench and is competitive with 13B-class chat models. The release includes deployment support for AWS, GCP, Azure, and HuggingFace, as well as local use via vLLM.

Long Context Evolution · Frontier Model Releases · Mistral AI · MT-Bench · Mistral 7B Instruct v0.2 · +13 more
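Sliding Window Attention, as described in the Mistral 7B summary, restricts each token to a fixed number of recent positions instead of the full prefix. A minimal sketch of the resulting attention mask is shown below. The toy sizes and the convention that the window includes the current token are assumptions for illustration. This is not Mistral's implementation.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query i may attend to key j.

    Causal (j <= i) and limited to the most recent `window` positions,
    counting the current token as part of the window (assumed convention).
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]

toy_mask = sliding_window_mask(seq_len=5, window=3)  # toy sizes
for row in toy_mask:
    print("".join("x" if allowed else "." for allowed in row))

# Attended pairs grow linearly with sequence length once it exceeds the window.
attended_pairs = sum(sum(row) for row in toy_mask)
```

Each row has at most `window` allowed positions, so attention cost per token is bounded by the window size rather than the sequence length.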

7 · Mistral AI News · Jun 1, 2026

Mistral AI Founding Manifesto and Mistral 7B Release

Mistral AI published its founding mission statement alongside the release of Mistral 7B, a 7-billion-parameter open-weights language model released under Apache 2.0. The company claims that the model, produced in three months from a standing start, outperforms all available open models up to 13B parameters on standard English and code benchmarks. The post articulates Mistral's strategic thesis: open-weight models will outcompete proprietary black-box APIs for most enterprise use cases, drawing analogies to Linux, WebKit, and Kubernetes. The company signals intent to release progressively larger frontier models while building a commercial offering around on-premise and VPC deployment.

Frontier Model Releases · Open Weights Progress · Mistral AI · Apache 2.0 · DeepMind · +8 more

3 · Hugging Face Blog · May 19, 2026

Comparing RoBERTa, Llama 2, and Mistral for Sequence Classification via LoRA on Disaster Tweets

A Hugging Face blog post benchmarks three models—RoBERTa, Llama 2, and Mistral—on a disaster tweet classification task using LoRA fine-tuning. The analysis compares parameter-efficient adaptation of encoder-only versus decoder-only architectures for a practical NLP classification problem. Results provide practitioners with guidance on model selection and LoRA configuration for sequence classification.

Open Weights Progress · Agent and Tool Ecosystem · RoBERTa · LoRA · Llama 2 · +2 more

5 · Hugging Face Blog · May 19, 2026

WWDC 24: Running Mistral 7B with Core ML

This Hugging Face blog post, published around WWDC 2024, covers running Mistral 7B on Apple devices using Core ML. It addresses on-device inference of a 7B-parameter open-weights model with Apple's ML framework, a practical deployment pattern for running capable open-weights LLMs locally on Apple Silicon hardware.

Open Weights Progress · Inference Economics · Mistral AI · WWDC 2024 · Mistral 7B · +4 more

6 · Qwen Research · May 18, 2026

Qwen1.5-MoE: Matching 7B Model Performance with 1/3 Activated Parameters

Alibaba's Qwen team releases Qwen1.5-MoE-A2.7B, a mixture-of-experts model with only 2.7 billion activated parameters that claims performance parity with 7B dense models such as Mistral 7B and Qwen1.5-7B. At inference time it uses roughly one-third of the activated parameters of those 7B models, offering significant compute efficiency gains. The release follows growing industry interest in MoE architectures sparked by Mixtral, and the model is available on GitHub, HuggingFace, and ModelScope.

Frontier Model Releases · Open Weights Progress · Mixtral · Qwen1.5-MoE-A2.7B · Qwen1.5-7B · +6 more
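The reason a mixture-of-experts model activates only part of its parameters is that a router sends each token to a few experts. A generic top-k routing step is sketched below. This is not Qwen's router: the summary gives no routing details, and the number of experts, the value of k, and the logits here are toy assumptions.

```python
import math

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    Returns {expert_index: weight}. Experts that are not selected are skipped
    entirely, which is why only a fraction of the parameters is active per token.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    exps = [math.exp(router_logits[i]) for i in chosen]
    total = sum(exps)
    return {i: e / total for i, e in zip(chosen, exps)}

toy_logits = [0.1, 2.0, -1.0, 1.0]  # one token, four hypothetical experts
routing = top_k_route(toy_logits, k=2)
print(routing)
```

With these toy values the token is handled by experts 1 and 3 only, and the other two experts' weights are never touched for this token.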