Expert-Aware Causal Tracing of Factual Recall in Sparse MoE Language Models

paper · active · provisional · expert-aware-causal-tracing-of-factual-recall-in-sparse-moe-language-models-255ffff8 · 1 event · first seen 13d ago

Recent events (1)

5 · arXiv · cs.CL · 13d ago

Expert-aware causal tracing of factual recall in sparse MoE language models

A new arXiv preprint extends causal tracing to sparse mixture-of-experts (MoE) language models, asking which routed experts mediate factual recall rather than only which layers or feed-forward modules do. Using facts from CounterFact, the authors apply noise-corruption and clean-patch interventions to Qwen3-30B-A3B-Base and Mixtral-8x7B-v0.1. They find that expert-level localization is possible in Qwen3-30B-A3B-Base (a single expert at layer 44), whereas Mixtral-8x7B-v0.1 requires multi-expert coalition recovery. The results indicate that factual localization in MoE models depends on the model and the protocol rather than being universal.
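To make the noise-corruption and clean-patch idea concrete, here is a minimal sketch on a toy top-k MoE layer in NumPy. It is not the preprint's code or protocol: the layer sizes, the linear "experts", the `readout` vector standing in for an answer-token logit, the noise scale, and the choice to patch gate-weighted expert contributions are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N_EXPERTS, TOP_K = 8, 4, 2                    # toy sizes, not those of any real model

W_gate = rng.normal(size=(N_EXPERTS, D))         # hypothetical router weights
W_experts = rng.normal(size=(N_EXPERTS, D, D))   # one linear map per toy "expert"
readout = rng.normal(size=D)                     # stand-in for the answer-token logit

def expert_contributions(x):
    """Return an (N_EXPERTS, D) array of gate-weighted expert outputs (zero if unrouted)."""
    logits = W_gate @ x
    top = np.argsort(logits)[-TOP_K:]            # indices of the top-k routed experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                         # softmax over the selected experts
    contrib = np.zeros((N_EXPERTS, D))
    for g, e in zip(gates, top):
        contrib[e] = g * (W_experts[e] @ x)
    return contrib

def score(x, patch=None):
    """Answer score for input x; `patch` maps expert index -> contribution to substitute."""
    contrib = expert_contributions(x)
    for e, value in (patch or {}).items():
        contrib[e] = value
    return float(readout @ (x + contrib.sum(axis=0)))   # residual stream plus experts

x_clean = rng.normal(size=D)
x_corrupt = x_clean + 3.0 * rng.normal(size=D)   # noise corruption of the input

clean_contrib = expert_contributions(x_clean)    # cached from the clean run
corrupt_score = score(x_corrupt)

# Single-expert patching: restore one clean expert contribution in the corrupted run.
effects = np.array([
    score(x_corrupt, {e: clean_contrib[e]}) - corrupt_score
    for e in range(N_EXPERTS)
])

# Coalition patching: restore all clean expert contributions at once.
coalition_effect = (
    score(x_corrupt, {e: clean_contrib[e] for e in range(N_EXPERTS)}) - corrupt_score
)
```

In this toy the single-expert effects add up exactly to the coalition effect, because the contributions are summed and the readout is linear. In a real transformer, later layers are nonlinear, so patching experts jointly can recover more than patching them one at a time. That is one plausible reading of why a coalition might be needed in one model and not in another.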