Almanac
paper

From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models

paperactiveprovisionalfrom-observation-to-intervention-a-causal-audit-of-expert-importance-in-mixture-of-experts-models-9fa736eb·1 events·first seen 7d ago

Aliases: From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models

Co-occurring entities

More like this (12)

Recent events (1)

6arXiv · cs.CL·7d ago·source ↗

Causal audit finds routing statistics do not predict expert importance in MoE pruning

A new arXiv paper conducts a token-level interventional audit of Mixture-of-Experts (MoE) pruning heuristics across three architectures (OLMoE-1B-7B, Qwen1.5-MoE, DeepSeek-V2-Lite), finding that no standard observational metric — utilization rates, activation norms, routing weight distributions — reliably predicts which experts can be removed without functional cost. Effect sizes fall below Cohen's d = 0.17 across all 60 metric-layer combinations after multiple-comparison correction, with only a single significant signal at OLMoE's final layer. The authors argue that existing pruning methods succeed not because they identify dispensable experts but because early-layer redundancy makes most selection criteria interchangeable. The work frames this as a concrete counterexample to the broader interpretability practice of treating associational (rung-1) evidence as interventional (rung-2) conclusions.