model
MobileMoE
model · active · provisional
mobilemoe-8592fc8c · 1 event · first seen 21d ago
Aliases: MobileMoE
Recent events (1)
MobileMoE: Scaling Mixture-of-Experts for Sub-Billion Parameter On-Device Deployment
MobileMoE introduces a family of on-device MoE language models with 0.3–0.9B active parameters and 1.3–5.3B total parameters, targeting mobile deployment under memory and compute constraints. The authors derive an on-device MoE scaling law identifying a sweet spot of moderate sparsity with fine-grained and shared experts, then train models through a four-stage recipe including quantization-aware training on open-source data. Across 14 benchmarks, MobileMoE matches or exceeds leading dense on-device LLMs with 2–4× fewer inference FLOPs, and delivers 1.8–3.8× faster prefill and 2.2–3.4× faster decode than dense baselines on commodity smartphones at comparable INT4 memory.
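To make the active-versus-total parameter distinction concrete, here is a back-of-the-envelope sketch. It rests on assumptions that are not in the summary above: the low and high ends of the two parameter ranges are paired into two hypothetical configurations, INT4 weights are taken as exactly 4 bits each (ignoring quantization scales, higher-precision layers, and the KV cache), and per-token compute is approximated by the common rule of thumb of about 2 FLOPs per active parameter. The function names are illustrative only.

```python
def int4_weight_gb(total_params: float, bits_per_weight: int = 4) -> float:
    """Approximate weight storage in GB (1e9 bytes).

    Assumption: every weight is stored at `bits_per_weight` bits; scales,
    zero-points, and any higher-precision layers are ignored.
    """
    return total_params * bits_per_weight / 8 / 1e9


def approx_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per active parameter."""
    return 2.0 * active_params


# Hypothetical pairing of the range endpoints quoted in the summary:
# (active parameters, total parameters).
configs = {
    "smallest": (0.3e9, 1.3e9),
    "largest": (0.9e9, 5.3e9),
}

for name, (active, total) in configs.items():
    memory_gb = int4_weight_gb(total)
    gflops = approx_flops_per_token(active) / 1e9
    active_share = active / total
    print(
        f"{name}: ~{memory_gb:.2f} GB INT4 weights, "
        f"~{gflops:.1f} GFLOPs/token, "
        f"{active_share:.0%} of weights active per token"
    )
```

Memory scales with total parameters while per-token compute scales with active parameters, which is why a sparse model can occupy a footprint similar to a dense model's and still run fewer FLOPs per token. These estimates are illustrative and are not figures reported by the authors.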