technique
Manifold Power Iteration
technique · active · provisional
manifold-power-iteration-e2d33f6a · 1 event · first seen 6d ago
Aliases: Manifold Power Iteration
More like this (12)
Manifold
Redesign Mixture-of-Experts Routers with Manifold Power Iteration
Iterated Amplification
G*Power
power-law scaling
Iterative Skill-Aware Decomposition
Maximal Update Parameterization (μP)
Layer Looping
Iterated Prisoner's Dilemma
Reflection Loops
Multimodal Continual Instruction Tuning
Divide-and-Conquer Partitioning
Recent events (1)
Manifold Power Iteration redesigns MoE routers by aligning rows with expert singular directions
A new arXiv preprint proposes Manifold Power Iteration (MPI), a principled redesign of Mixture-of-Experts router matrices that aligns each router row with the principal singular direction of its associated expert. The method uses a 'Power-then-Retract' paradigm to enforce norm constraints while driving convergence toward these singular directions. Empirical validation spans MoE pretraining at scales from 1B to 11B parameters, showing improved model effectiveness.
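The 'Power-then-Retract' idea can be illustrated with a minimal sketch. This is not the preprint's implementation: it assumes each expert is summarized by a single weight matrix `W`, that the target is the top right singular vector of `W`, and that the norm constraint is a fixed-radius sphere. The names `power_then_retract`, `retract`, and `radius` are hypothetical and chosen only for illustration.

```python
import math
import random


def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Return the transpose of matrix A as a list of rows."""
    return [list(col) for col in zip(*A)]


def retract(v, radius=1.0):
    """Retract v onto the sphere of the given radius (assumed norm constraint)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [radius * c / norm for c in v]


def power_then_retract(W, r, steps=200, radius=1.0):
    """Alternate a power step (W^T W r) with a retraction onto the sphere.

    Under the assumptions above, r converges (up to sign) to the top right
    singular vector of W, provided the initial r is not orthogonal to it and
    the largest singular value is strictly greater than the second largest.
    """
    Wt = transpose(W)
    r = retract(r, radius)
    for _ in range(steps):
        r = matvec(Wt, matvec(W, r))  # power step
        r = retract(r, radius)        # retract to enforce the norm constraint
    return r


# Toy expert whose singular values are 3, 1, and 0.5 along the coordinate axes,
# so the principal singular direction is the first axis.
W = [
    [3.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.5],
]
rng = random.Random(0)
r0 = [rng.uniform(-1.0, 1.0) for _ in range(3)]
router_row = power_then_retract(W, r0)
```

In this toy setup the resulting `router_row` has unit norm and points along the first axis, up to sign. How the actual method interleaves such steps with gradient-based training, and which expert matrix it uses, is not specified in the summary above.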