Entity · model

DeepSeek Coder V2 lite

model · active · deepseek-coder-v2-lite-dc10a5a1 · 2 events · first seen Jun 1, 2026

Aliases: DeepSeek Coder V2 lite, DeepSeek-V2-Lite

Co-occurring entities

OLMoE-1B-7B-0924 · From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models · Qwen1.5-MoE-A2.7B · Mistral AI · HumanEvalFIM · Azure Foundry · Google Cloud Vertex AI · LiveCodeBench · Codestral 2405 · Codestral 25.08 · LMsys Copilot Arena · deepseek-coder · Ty Dunn · HumanEval · Continue

More like this (12)

DeepSeek-Coder-V1-6.7B · deepseek-coder · DeepSeek-Coder-V2-0724 · DeepSeek-Prover-V2-7B · DeepSeek-V4-Pro Preview · DeepSeek-Math-V2 · DeepSeek-V2.5-1210 · DeepSeek-V3.1-Base · DeepSeek-OCR-2 · DeepSeek V4 · DeepSeek-R1-Lite-Preview · DeepSeek-V4-Flash Preview

Recent events (2)

6 · arXiv · cs.CL · Jun 10, 2026

Causal audit finds routing statistics do not predict expert importance in MoE pruning

A new arXiv paper conducts a token-level interventional audit of Mixture-of-Experts (MoE) pruning heuristics across three architectures (OLMoE-1B-7B, Qwen1.5-MoE, DeepSeek-V2-Lite), finding that no standard observational metric — utilization rates, activation norms, routing weight distributions — reliably predicts which experts can be removed without functional cost. Effect sizes fall below Cohen's d = 0.17 across all 60 metric-layer combinations after multiple-comparison correction, with only a single significant signal at OLMoE's final layer. The authors argue that existing pruning methods succeed not because they identify dispensable experts but because early-layer redundancy makes most selection criteria interchangeable. The work frames this as a concrete counterexample to the broader interpretability practice of treating associational (rung-1) evidence as interventional (rung-2) conclusions.

Evaluation and Benchmarking · Inference Economics · OLMoE-1B-7B-0924 · From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models · Qwen1.5-MoE-A2.7B
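
To make the effect-size language in the summary above concrete, here is a minimal Python sketch of how Cohen's d and a corrected significance threshold over 60 comparisons could be computed. It is an illustration only: the summary does not say which multiple-comparison correction the authors used, so Bonferroni is an assumption here, the function names `cohens_d` and `bonferroni_alpha` are hypothetical, and the two samples are synthetic stand-ins rather than data from the paper.

```python
import math
import random
import statistics


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction (assumed method)."""
    return alpha / n_tests


random.seed(0)
# Synthetic stand-ins: functional cost of ablating experts that an observational
# metric ranks as "important" vs. "unimportant". Not data from the paper.
cost_high_rank = [random.gauss(0.10, 1.0) for _ in range(2000)]
cost_low_rank = [random.gauss(0.00, 1.0) for _ in range(2000)]

d = cohens_d(cost_high_rank, cost_low_rank)
per_test_alpha = bonferroni_alpha(0.05, 60)  # 60 metric-layer combinations

print(f"Cohen's d: {d:.3f}")
print(f"Per-test alpha for 60 comparisons: {per_test_alpha:.5f}")
```

By the usual convention, a d below 0.2 counts as a small effect, which is the context for the summary's report that all effect sizes fall below 0.17.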

7 · Mistral AI News · Jun 1, 2026

Codestral 25.01: Mistral AI Releases Updated Coding Model with 2x Speed and Improved FIM Performance

Mistral AI has released Codestral 25.01, an upgrade to its Codestral coding model featuring a more efficient architecture and an improved tokenizer that generates code approximately 2x faster than its predecessor. Mistral claims state-of-the-art performance on fill-in-the-middle (FIM) tasks among models under 100B parameters, with a 256k context window and support for 80+ programming languages. The reported benchmarks show improvements over Codestral 2405 and competitive or superior results against DeepSeek Coder V2 lite and DeepSeek Coder 33B on HumanEval and FIM metrics. The model is available via Mistral's API, through IDE plugins (VS Code, JetBrains via Continue), and for on-premises or VPC deployment, with cloud availability on Vertex AI and Azure AI Foundry.

Frontier Model Releases · Evaluation and Benchmarking · Mistral AI · HumanEvalFIM · Azure Foundry