technique
gradient noise scale
techniqueactive
gradient-noise-scale-68adf1a0·1 events·first seen 28d agoAliases: gradient noise scale
Co-occurring entities
More like this (12)
Recent events (1)
How AI Training Scales: Gradient Noise Scale Predicts Batch Parallelizability
OpenAI researchers report that the gradient noise scale — a statistical metric measuring gradient variance relative to mean — reliably predicts the optimal batch size and degree of parallelizability across a wide range of neural network training tasks. The finding suggests that more complex tasks with noisier gradients can benefit from increasingly large batch sizes, removing a potential ceiling on scaling. The work frames training dynamics as a systematic, measurable process rather than empirical art.