paper
Mechanism-Driven Monitors for Preemptive Detection of LLM Training Instability
paper · active · provisional
mechanism-driven-monitors-for-preemptive-detection-of-llm-training-instability-7bcc5a3e · 1 event · first seen 38h ago
Aliases: Mechanism-Driven Monitors for Preemptive Detection of LLM Training Instability
Co-occurring entities
More like this (12)
LLM-as-monitor
A sleep-like consolidation mechanism for LLMs
Language Model Safety Monitor
Paved with True Intents: Intent-Aware Training Improves LLM Safety Classification Across Training Regimes
Stateful Online Monitor
Backdoor Unlearning Generalization: A Path Toward the Removal of Unknown Triggers in LLMs
ExpRL: Exploratory RL for LLM Mid-Training
Continual LLM Upcycling: A Predictor-Gated Bank-Wise Sparsity Training Recipe for Dense-to-Sparse LLMs
Tool Monitor
Verifier-in-the-Loop Training (ViL)
SIMMER: Benchmarking Latent Failures in LLM Executable Planning with a World Model
Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals
Recent events (1)
Mechanism-driven internal monitors detect LLM training instability thousands of steps before loss divergence
A new arXiv preprint proposes mechanism-driven monitoring signals derived from the functional roles of critical modules (low-precision flash attention, MoE routers) to detect training instability before it manifests in loss or gradient norms. The authors derive monitors such as spectral entropy of a QK bilinear decomposition and MoE router indicators, showing via fault-injection experiments that these signals trigger thousands of steps ahead of loss divergence. The work targets a high-cost failure mode in frontier LLM training where instability can persist undetected for thousands of steps on expensive accelerator fleets.
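To make the idea of an entropy-based monitor concrete, here is a minimal sketch, not the preprint's actual method. The summary does not give the exact definitions, so everything here is an assumption: `spectral_entropy` takes singular values of some QK-derived matrix (computed elsewhere) and measures the entropy of their squared, normalized spectrum; `router_load_entropy` applies the same measure to hypothetical per-expert token counts; and `first_alarm_step` assumes that instability shows up as the signal falling below a fixed floor. All function names and the threshold rule are illustrative.

```python
import math


def normalized_entropy(weights):
    """Shannon entropy of non-negative weights, scaled to the range [0, 1]."""
    total = sum(weights)
    if total <= 0 or len(weights) < 2:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by the maximum possible entropy (uniform over all entries).
    return entropy / math.log(len(weights))


def spectral_entropy(singular_values):
    """Entropy of the spectrum; assumes energy is the squared singular value."""
    return normalized_entropy([s * s for s in singular_values])


def router_load_entropy(tokens_per_expert):
    """Entropy of the token load across experts (hypothetical router signal)."""
    return normalized_entropy(tokens_per_expert)


def first_alarm_step(history, floor):
    """Return the first step whose signal is below `floor`, or None."""
    for step, value in enumerate(history):
        if value < floor:
            return step
    return None
```

A value near 1 means the spectrum (or expert load) is spread evenly, and a value near 0 means it has collapsed onto a single direction (or expert). In use, one would log the signal every few steps and compare the step returned by `first_alarm_step` with the step at which the loss visibly diverges. Whether a falling or a rising value is the right trigger for a given module is something the summary does not state.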