Entity · model

OLMo2

model · active · olmo2-0d7f5e9b · 1 event · first seen May 25, 2026

Aliases: OLMo2

Co-occurring entities

Shannon-Hartley Theorem · Shannon Scaling Law · Pythia · quantization-induced degradation · catastrophic overtraining · signal-to-noise ratio (SNR)

More like this (12)

OLMo-3 · OLMo · OLMoE · OLMo-1B · MoE²-LoRA · OLMoE-1B-7B · OLMoE-1B-7B-0924 · CO-LMLM · LoCoMo · omlx · LoMo · LLaMA-Omni

Recent events (1)

7 · arXiv · cs.LG · May 25, 2026

Shannon Scaling Law: A Noisy-Channel Framework for LLM Capacity and Non-Monotonic Training Phenomena

Researchers propose the Shannon Scaling Law, a theoretical framework that uses the Shannon-Hartley theorem to model LLM training as information transmission over a noisy channel. The framework maps model parameters to channel bandwidth and training tokens to signal power, which yields an SNR-based capacity limit. That limit explains non-monotonic phenomena, such as catastrophic overtraining and quantization-induced degradation, that classical power-law scaling laws cannot capture. The law is validated on Pythia and OLMo2 under Gaussian noise, quantization, and fine-tuning perturbations; it achieves strong R² scores and extrapolates from 6.9B to 12B parameter models trained on up to 307B tokens. It outperforms both classical and perturbation-aware scaling laws and predicts U-shaped performance degradation when SNR is insufficient.

Training Infrastructure · Evaluation and Benchmarking · Shannon-Hartley Theorem · Shannon Scaling Law · Pythia
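
For readers unfamiliar with the theorem behind the summary above, the following is a minimal sketch of the standard Shannon-Hartley capacity formula, C = B · log2(1 + S/N). It is not the paper's fitted law: the summary does not give the exact functional form, so the function name `shannon_hartley_capacity` and the unit-free reading of "bandwidth" as parameter count and "signal power" as training tokens are assumptions made purely for illustration.

```python
import math


def shannon_hartley_capacity(bandwidth: float,
                             signal_power: float,
                             noise_power: float) -> float:
    """Standard Shannon-Hartley capacity: C = B * log2(1 + S / N).

    Illustrative assumption only: in the analogy from the summary,
    `bandwidth` stands in for model parameters and `signal_power`
    for training tokens. The paper's actual parameterization is not
    reproduced here.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if signal_power < 0:
        raise ValueError("signal_power must be non-negative")
    if noise_power <= 0:
        raise ValueError("noise_power must be positive")
    snr = signal_power / noise_power  # signal-to-noise ratio (linear, not dB)
    return bandwidth * math.log2(1.0 + snr)


# With S = N the SNR is 1, so capacity equals the bandwidth: log2(2) = 1.
unit_capacity = shannon_hartley_capacity(1.0, 1.0, 1.0)

# Holding bandwidth and signal fixed, more noise (for example from a
# perturbation such as quantization) lowers the capacity.
clean = shannon_hartley_capacity(2.0, 3.0, 1.0)
noisy = shannon_hartley_capacity(2.0, 3.0, 3.0)
```

The sketch only shows the qualitative behaviour the summary relies on: capacity grows with bandwidth and signal power, and shrinks as noise rises, so a perturbation that lowers SNR can cap or reduce achievable performance.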