Entity · paper

From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models

paper · active · from-observation-to-intervention-a-causal-audit-of-expert-importance-in-mixture-of-experts-models-9fa736eb · 1 event · first seen Jun 10, 2026

Co-occurring entities

OLMoE-1B-7B-0924 · Qwen1.5-MoE-A2.7B · DeepSeek Coder V2 lite

More like this (12)

Sparse Mixture-of-Experts
Mixture of Experts
Mixture of Efficient Experts
Toward Calibrated Mixture-of-Experts Under Distribution Shift
Tying the Loop -- Tied Expert Layers in Mixture-of-Experts Language Models
Beyond Third-Person Audits: Situated Interaction Auditing for User-Centered LLM Bias Research
Expert-Aware Causal Tracing of Factual Recall in Sparse MoE Language Models
Expert Blindness Effect
Relaxing Faithfulness with Intervention-Only Causal Discovery
Redesign Mixture-of-Experts Routers with Manifold Power Iteration
Reason-Mediated Behavioral Models for Auditing LLM Social Simulators
Same Evidence, Different Answer: Auditing Order Sensitivity in Multimodal Large Language Models

Recent events (1)

arXiv · cs.CL · Jun 10, 2026

Causal audit finds routing statistics do not predict expert importance in MoE pruning

A new arXiv paper conducts a token-level interventional audit of Mixture-of-Experts (MoE) pruning heuristics across three architectures (OLMoE-1B-7B, Qwen1.5-MoE, DeepSeek-V2-Lite). It finds that no standard observational metric (utilization rates, activation norms, routing weight distributions) reliably predicts which experts can be removed without functional cost. Effect sizes fall below Cohen's d = 0.17 across all 60 metric-layer combinations after multiple-comparison correction, with only a single significant signal, at OLMoE's final layer. The authors argue that existing pruning methods succeed not because they identify dispensable experts but because early-layer redundancy makes most selection criteria interchangeable. The work frames this as a concrete counterexample to the broader interpretability practice of treating associational (rung-1) evidence as if it supported interventional (rung-2) conclusions.
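The two statistics named in the summary can be illustrated with a minimal sketch. It is not the paper's code. It assumes that Cohen's d compares the measured ablation cost of experts a heuristic flags as dispensable against the cost of the remaining experts, and it assumes Holm's step-down procedure as the multiple-comparison correction, since the summary does not say which correction was used. The function names and the toy numbers are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def holm_reject(p_values, alpha=0.05):
    """Holm step-down correction (an assumed choice, not confirmed by the source).

    Returns one boolean per input p-value: True where the null is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is tested against alpha / (m - k).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values also fail
    return reject


# Hypothetical loss increases after ablating each expert (toy data only).
flagged_dispensable = [1.0, 2.0, 3.0, 4.0, 5.0]
remaining_experts = [2.0, 3.0, 4.0, 5.0, 6.0]
effect = cohens_d(flagged_dispensable, remaining_experts)

# Hypothetical p-values for four metric-layer combinations.
decisions = holm_reject([0.0005, 0.04, 0.03, 0.2])
```

In an audit of the kind described, one effect size and one p-value would be computed per metric-layer combination (60 in the paper), and the correction would be applied across all of them together.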

Evaluation and Benchmarking · Inference Economics · OLMoE-1B-7B-0924 · From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models · Qwen1.5-MoE-A2.7B