Global-batch Load Balancing

technique · active · global-batch-load-balancing-fef2ab86 · 1 event · first seen 1mo ago

Recent events (1)

Qwen Research · 1mo ago

Global-batch Load Balancing for MoE LLM Training from Qwen

Qwen Research introduces a global-batch load balancing technique for training Mixture-of-Experts (MoE) LLMs, and claims it is nearly a 'free lunch' improvement. The method addresses imbalance in expert load across training batches, a known efficiency and quality bottleneck in MoE architectures. It targets the router and the expert activation dynamics in transformer-based MoE layers.
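
The summary above does not give the formula, so the following is only a minimal sketch of what balancing at the global-batch level could look like. It assumes the common Switch-Transformer-style auxiliary loss, N · Σ f_i · p_i (f_i is the fraction of tokens whose top-1 expert is i, p_i is the mean router probability for expert i), and it assumes that "global-batch" means computing f_i and p_i over the pooled tokens of all micro-batches instead of per micro-batch. The function names and the toy router outputs are hypothetical and are not taken from the Qwen work.

```python
def routing_stats(router_probs):
    """Return (f, p) for one batch of per-token router probabilities.

    f[i]: fraction of tokens whose top-1 expert is i.
    p[i]: mean router probability assigned to expert i.
    """
    num_tokens = len(router_probs)
    num_experts = len(router_probs[0])
    counts = [0] * num_experts
    prob_sums = [0.0] * num_experts
    for probs in router_probs:
        top1 = max(range(num_experts), key=lambda i: probs[i])
        counts[top1] += 1
        for i, prob in enumerate(probs):
            prob_sums[i] += prob
    f = [c / num_tokens for c in counts]
    p = [s / num_tokens for s in prob_sums]
    return f, p


def balance_loss(router_probs):
    """Assumed auxiliary loss: N * sum_i f_i * p_i (1.0 when perfectly uniform)."""
    f, p = routing_stats(router_probs)
    return len(f) * sum(fi * pi for fi, pi in zip(f, p))


# Toy data (hypothetical): two micro-batches, two experts.
# Each micro-batch is "specialized": all of its tokens prefer one expert.
micro_a = [[0.9, 0.1], [0.8, 0.2]]
micro_b = [[0.1, 0.9], [0.2, 0.8]]

# Micro-batch balancing: the loss is computed per micro-batch, then averaged.
micro_batch_loss = (balance_loss(micro_a) + balance_loss(micro_b)) / 2

# Global-batch balancing: statistics are computed over the pooled tokens.
global_batch_loss = balance_loss(micro_a + micro_b)

print(micro_batch_loss, global_batch_loss)
```

In this toy case the per-micro-batch loss is about 1.7, because each micro-batch looks imbalanced on its own, while the pooled loss is about 1.0, because the two experts are evenly loaded over the whole batch. Under the assumptions above, the global variant penalizes the router less for routing similar tokens to the same expert. In a distributed setting the pooled statistics would presumably require synchronizing per-expert counts across devices, which this sketch does not model.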