Almanac
technique

gradient noise scale

techniqueactivegradient-noise-scale-68adf1a0·1 events·first seen 28d ago

Aliases: gradient noise scale

Co-occurring entities

More like this (12)

Recent events (1)

6Openai Blog·28d ago·source ↗

How AI Training Scales: Gradient Noise Scale Predicts Batch Parallelizability

OpenAI researchers report that the gradient noise scale — a statistical metric measuring gradient variance relative to mean — reliably predicts the optimal batch size and degree of parallelizability across a wide range of neural network training tasks. The finding suggests that more complex tasks with noisier gradients can benefit from increasingly large batch sizes, removing a potential ceiling on scaling. The work frames training dynamics as a systematic, measurable process rather than empirical art.