technique
SARA
technique · active · provisional
sara-efdab9e5 · 1 event · first seen 10h ago
Aliases: SARA
Recent events (1)
SARA framework aligns MoE routing distributions to improve low-resource multilingual performance
Researchers introduce SARA (Semantically Anchored Routing Alignment), a framework that addresses cross-lingual routing divergence in sparse Mixture-of-Experts (MoE) LLMs. It aligns the internal routing distributions of low-resource-language tokens with those of their high-resource semantic anchors through a symmetric Jensen-Shannon divergence constraint. Unlike logit-level distillation, SARA operates directly on the MoE routing layers, encouraging mechanistic consistency in expert selection across languages. Experiments on Qwen3-30B-A3B and Phi-3.5-MoE-instruct across 5 low-resource languages show modest but consistent gains (up to +1.2%) on Global-MMLU over standard instruction tuning.
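The routing-alignment idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-layer averaging, and the assumption that each low-resource token is already paired with one high-resource anchor token are all hypothetical choices made for illustration. It shows only how a Jensen-Shannon divergence between two routers' expert distributions could be computed from router logits.

```python
import math


def softmax(logits):
    """Turn router logits into a probability distribution over experts."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric in p and q, at most ln 2."""
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


def routing_alignment_loss(low_resource_logits, anchor_logits):
    """Mean JS divergence across MoE layers (hypothetical aggregation).

    Each argument is a list with one router-logit vector per MoE layer,
    for a low-resource token and its paired high-resource anchor token.
    """
    per_layer = [
        js_divergence(softmax(low), softmax(anchor))
        for low, anchor in zip(low_resource_logits, anchor_logits)
    ]
    return sum(per_layer) / len(per_layer)


# Example: two MoE layers with four experts each (made-up logits).
low = [[2.0, 0.1, 0.1, 0.1], [0.3, 0.3, 1.5, 0.2]]
anchor = [[0.1, 2.0, 0.1, 0.1], [0.3, 0.3, 1.4, 0.2]]
loss = routing_alignment_loss(low, anchor)
```

In a training setup, a term like this would presumably be added to the instruction-tuning loss with some weight. The summary above does not state that weight, how anchors are selected, or whether gradients flow through the anchor side.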