Toward Calibrated Mixture-of-Experts Under Distribution Shift

paper · active · provisional · toward-calibrated-mixture-of-experts-under-distribution-shift-2350fa95 · 1 event · first seen 47h ago

Recent events (1)

4 · arXiv · cs.AI · 47h ago

Calibrated Mixture-of-Experts under distribution shift: an adversarial reweighting approach

A new arXiv preprint analyzes when mixture-of-experts (MoE) models remain calibrated under distribution shift, focusing on the interaction between the routing mechanism and expert-level calibration. The authors prove that calibrated experts are sufficient for a calibrated overall model under hard routing, but not under soft routing. To close the soft-routing gap, they propose an adversarial reweighting method that penalizes the calibration error of the routed aggregate under distribution shift, and they report improved accuracy-calibration tradeoffs across model classes and tasks.
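
To make the soft-routing gap concrete, here is a minimal toy sketch. It is not the preprint's construction or its reweighting method. It assumes a binary label `y = x1 AND x2` over four equally likely inputs, two hypothetical experts that each see one feature, and a fixed 50/50 gate standing in for a soft router. The helper `expected_calibration_error` is an illustrative name, implementing a standard binned ECE.

```python
import numpy as np

def expected_calibration_error(probs, labels, weights, n_bins=10):
    """Weighted binned ECE for binary predictions:
    sum over bins of (bin mass) * |mean label - mean predicted probability|."""
    probs, labels, weights = map(np.asarray, (probs, labels, weights))
    weights = weights / weights.sum()
    # Assign each prediction to an equal-width bin on [0, 1].
    bins = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        mass = weights[mask].sum()
        if mass == 0:
            continue
        mean_prob = np.average(probs[mask], weights=weights[mask])
        mean_label = np.average(labels[mask], weights=weights[mask])
        ece += mass * abs(mean_label - mean_prob)
    return float(ece)

# Toy population (assumption): four equally likely inputs, label y = x1 AND x2.
x1 = np.array([0, 0, 1, 1])
x2 = np.array([0, 1, 0, 1])
y = x1 & x2
weights = np.full(4, 0.25)

# Each hypothetical expert outputs P(y = 1 | its own feature),
# so each is perfectly calibrated on its own.
expert_a = 0.5 * x1
expert_b = 0.5 * x2

# Soft routing stand-in (assumption): a fixed 50/50 gate averaging the experts.
soft_aggregate = 0.5 * expert_a + 0.5 * expert_b

ece_a = expected_calibration_error(expert_a, y, weights)
ece_b = expected_calibration_error(expert_b, y, weights)
ece_soft = expected_calibration_error(soft_aggregate, y, weights)
print(ece_a, ece_b, ece_soft)
```

In this toy setup each expert has zero ECE, while the averaged output has an ECE of 0.25: it predicts 0.5 where the label is always 1 and 0.25 where the label is always 0. A penalty on the calibration error of the routed aggregate, as the summary describes, would target exactly this quantity rather than the per-expert errors.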