mean-field theory
mean-field-theory-d10647d2·1 events·first seen 22d agoAliases: mean-field theory
Co-occurring entities
More like this (12)
Recent events (1)
Hamiltonian Probability Gradient Flow Analysis of the Muon Optimizer
This paper develops a rigorous theoretical framework for the Muon optimizer by interpreting its regularized orthogonalization map as the gradient of a Fenchel-dual smoothing of the nuclear norm, identifying Muon updates as mirror/prox steps with momentum as dual coordinates. The authors lift this structure to probability measures over matrix-valued parameters, deriving a mean-field phase-space equation that constitutes a damped Hamiltonian probability dynamics with monotonically decreasing Hamiltonian energy. Exponential convergence rates are established under gradient-dominance and curvature assumptions, and propagation-of-chaos guarantees are provided for the interacting particle system. The framework extends to transformer mixture-of-experts architectures via blockwise Muon probability flows.