Entity · model

MobileMoE

model · active · mobilemoe-8592fc8c · 1 event · first seen May 27, 2026

Aliases: MobileMoE

Co-occurring entities

MobileLLM-Pro · OLMoE-1B-7B · INT4 Quantization · Mixture of Experts · on-device MoE scaling law · quantization-aware training

More like this (12)

OLMoE · SegMoE · AnyMo · LatentMoE · MoE²-LoRA · Qwen3.5 MoE · Stable LatentMoE · Localized LoRA-MoE · on-device MoE scaling law · MoCA · MOSS · MOJO

Recent events (1)

7 · arXiv · cs.CL · May 27, 2026

MobileMoE: Scaling Mixture-of-Experts for Sub-Billion Parameter On-Device Deployment

MobileMoE introduces a family of on-device MoE language models with 0.3–0.9B active parameters and 1.3–5.3B total parameters, targeting mobile deployment under memory and compute constraints. The authors derive an on-device MoE scaling law identifying a sweet spot of moderate sparsity with fine-grained and shared experts, then train models through a four-stage recipe including quantization-aware training on open-source data. Across 14 benchmarks, MobileMoE matches or exceeds leading dense on-device LLMs with 2–4× fewer inference FLOPs, and delivers 1.8–3.8× faster prefill and 2.2–3.4× faster decode than dense baselines on commodity smartphones at comparable INT4 memory.
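The gap between active and total parameters in the summary above follows from how a sparse MoE layer is built: every token passes through the shared experts plus a small top-k subset of the routed experts, while all experts must still be stored in memory. The sketch below shows that arithmetic for one feed-forward layer. It is a minimal illustration, not MobileMoE's actual architecture: the `MoEFFNConfig` class, the gated three-projection expert, and every numeric value are assumptions chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class MoEFFNConfig:
    """Hypothetical MoE feed-forward layer; all fields are illustrative assumptions."""
    d_model: int   # hidden size of the transformer
    d_expert: int  # intermediate size of one fine-grained expert
    n_routed: int  # routed experts stored in the layer
    n_shared: int  # shared experts applied to every token
    top_k: int     # routed experts selected per token


def expert_params(cfg: MoEFFNConfig) -> int:
    # Assumes a gated FFN (up, gate, and down projections) without biases.
    return 3 * cfg.d_model * cfg.d_expert


def router_params(cfg: MoEFFNConfig) -> int:
    # Assumes a single linear router that scores every routed expert.
    return cfg.d_model * cfg.n_routed


def total_ffn_params(cfg: MoEFFNConfig) -> int:
    # Everything that must be held in memory for this layer.
    return (cfg.n_routed + cfg.n_shared) * expert_params(cfg) + router_params(cfg)


def active_ffn_params(cfg: MoEFFNConfig) -> int:
    # Parameters actually used to process one token.
    return (cfg.top_k + cfg.n_shared) * expert_params(cfg) + router_params(cfg)


# Made-up configuration: moderate sparsity, small experts, one shared expert.
cfg = MoEFFNConfig(d_model=1024, d_expert=512, n_routed=32, n_shared=1, top_k=4)
total = total_ffn_params(cfg)
active = active_ffn_params(cfg)
print(f"total={total:,} active={active:,} ratio={total / active:.2f}")
```

Under these assumptions, per-token compute scales with the active count while the memory footprint scales with the total count, which is why the summary reports both figures separately.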

Training Infrastructure · Frontier Model Releases · MobileLLM-Pro · OLMoE-1B-7B · INT4 Quantization