Entity · model

OLMoE-1B-7B-0924

model · active · olmoe-1b-7b-0924-612c42bd · 1 event · first seen Jun 10, 2026

Aliases: OLMoE-1B-7B-0924

Co-occurring entities

From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models · Qwen1.5-MoE-A2.7B · DeepSeek Coder V2 lite

More like this (12)

OLMoE-1B-7B · OLMo-1B · OLMo-3 · OLMoE · LFM2-8B-A1B · LLaMA-2-13B · MoE²-LoRA · GLM-Z1-9B-0414 · OLMo2 · LLaVA-1.5-7B · EuroLLM-9B · LLaVA-1.5-13B

Recent events (1)

arXiv · cs.CL · Jun 10, 2026

Causal audit finds routing statistics do not predict expert importance in MoE pruning

A new arXiv paper conducts a token-level interventional audit of Mixture-of-Experts (MoE) pruning heuristics across three architectures (OLMoE-1B-7B, Qwen1.5-MoE, DeepSeek-V2-Lite). It finds that no standard observational metric (utilization rates, activation norms, routing weight distributions) reliably predicts which experts can be removed without functional cost. Effect sizes fall below Cohen's d = 0.17 across all 60 metric-layer combinations after multiple-comparison correction, with only a single significant signal at OLMoE's final layer. The authors argue that existing pruning methods succeed not because they identify dispensable experts but because early-layer redundancy makes most selection criteria interchangeable. The work frames this as a concrete counterexample to the broader interpretability practice of treating associational (rung-1) evidence as if it supported interventional (rung-2) conclusions.
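To make the reported statistic concrete, here is a minimal sketch of how a Cohen's d effect size and a corrected significance threshold for 60 tests could be computed. The summary does not say which multiple-comparison correction the paper uses, so the Bonferroni correction below is an assumption. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction (assumed here)."""
    return alpha / n_tests


# Hypothetical per-token ablation costs for experts with a high vs. low
# value of some observational metric (illustrative numbers only).
high_metric = [0.52, 0.48, 0.55, 0.50, 0.47, 0.53]
low_metric = [0.50, 0.49, 0.51, 0.52, 0.46, 0.50]

d = cohens_d(high_metric, low_metric)
per_test_alpha = bonferroni_threshold(0.05, 60)  # 60 metric-layer combinations
print(f"Cohen's d = {d:.3f}, per-test alpha = {per_test_alpha:.5f}")
```

By the usual convention, a d of about 0.2 counts as a small effect, so values below 0.17 mean the metric barely separates costly experts from cheap ones.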

Evaluation and Benchmarking · Inference Economics · OLMoE-1B-7B-0924 · From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models · Qwen1.5-MoE-A2.7B