technique
Supervised Memory Training
technique · active · provisional
supervised-memory-training-0b78c137 · 1 event · first seen 12d ago
Aliases: Supervised Memory Training
More like this (12)
Self-Supervised Pretraining
self-training
supermemory
Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories
Reference-Augmented Training
Unsupervised Pre-training
Self-Supervised Learning
supermemoryai
temporally ordered pre-training
Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It
supervised fine-tuning
Recalling Too Well: Sycophancy Evaluation and Mitigation in Memory-Augmented Models
Recent events (1)
Supervised Memory Training enables parallel RNN pretraining without backpropagation through time
A new arXiv preprint proposes Supervised Memory Training (SMT), a method that trains recurrent neural networks by reducing the problem to supervised learning on one-step memory transitions, bypassing backpropagation through time (BPTT) entirely. A Transformer-based encoder generates memory labels via a predictive state objective, enabling time-parallel training with O(1) gradient path length between any two tokens. According to the preprint, SMT outperforms BPTT on language modeling and pixel sequence modeling tasks across multiple RNN architectures. The approach could enable RNNs to scale more effectively by decoupling memory content from update mechanics.
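
To make the "one-step memory transition" idea concrete, here is a minimal NumPy sketch. It is not the preprint's implementation. It assumes a plain tanh RNN cell, and it replaces the Transformer-based encoder with a hypothetical teacher RNN that supplies the memory labels. All function names and hyperparameters are illustrative. It only shows that once a label exists for every time step, every transition can be fitted as an independent supervised example in one batched pass, with no gradient flowing through time.

```python
import numpy as np

def one_step_loss_and_grads(W, U, memory_labels, inputs):
    """Supervised loss on one-step transitions m_{t-1}, x_t -> m_t.

    memory_labels: (T+1, d) target memories (assumed given by some labeler).
    inputs:        (T, k) input sequence.
    All T transitions are evaluated at once; no recurrence is unrolled.
    """
    prev = memory_labels[:-1]              # (T, d) labeled previous memories
    target = memory_labels[1:]             # (T, d) labeled next memories
    pred = np.tanh(prev @ W.T + inputs @ U.T)
    err = pred - target
    T = inputs.shape[0]
    loss = 0.5 * np.sum(err ** 2) / T
    dpre = err * (1.0 - pred ** 2) / T     # gradient at the pre-activation
    return loss, dpre.T @ prev, dpre.T @ inputs

def make_toy_data(T=64, d=4, k=3, seed=0):
    """Stand-in labeler (assumption): a fixed random teacher RNN."""
    rng = np.random.default_rng(seed)
    W_true = 0.5 * rng.standard_normal((d, d))
    U_true = 0.5 * rng.standard_normal((d, k))
    inputs = rng.standard_normal((T, k))
    labels = np.zeros((T + 1, d))
    for t in range(T):
        labels[t + 1] = np.tanh(W_true @ labels[t] + U_true @ inputs[t])
    return labels, inputs

def train(labels, inputs, steps=300, lr=0.2, seed=1):
    """Plain gradient descent on the time-parallel one-step loss."""
    rng = np.random.default_rng(seed)
    d, k = labels.shape[1], inputs.shape[1]
    W = 0.1 * rng.standard_normal((d, d))
    U = 0.1 * rng.standard_normal((d, k))
    losses = []
    for _ in range(steps):
        loss, grad_W, grad_U = one_step_loss_and_grads(W, U, labels, inputs)
        losses.append(loss)
        W -= lr * grad_W
        U -= lr * grad_U
    return W, U, losses

if __name__ == "__main__":
    labels, inputs = make_toy_data()
    W, U, losses = train(labels, inputs)
    print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Because each prediction depends only on a labeled previous memory and the current input, the gradient for any transition touches a single application of the cell. That is the sense in which the gradient path length is constant in sequence length. How good the resulting RNN is depends on the quality of the labels, which this sketch does not address.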