Entity · technique

Supervised Memory Training

technique · active · supervised-memory-training-0b78c137 · 1 event · first seen Jun 5, 2026

Aliases: Supervised Memory Training

Co-occurring entities

backpropagation through time · Pretraining Recurrent Networks without Recurrence

More like this (12)

Self-Supervised Pretraining · Holonomy Memory Reinforcement Learning · self-training · supermemory · Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories · Reference-Augmented Training · Associative Recurrent Memory Transformer · Unsupervised Pre-training · Self-Supervised Learning · supermemoryai · temporally ordered pre-training · Self-Guided Test-Time Training

Recent events (1)

arXiv · cs.LG · Jun 5, 2026

Supervised Memory Training enables parallel RNN pretraining without backpropagation through time

A new arXiv preprint proposes Supervised Memory Training (SMT), a method that trains recurrent neural networks (RNNs) without backpropagation through time (BPTT) by reducing the problem to supervised learning on one-step memory transitions. A Transformer-based encoder generates the memory labels via a predictive state objective, so every transition can be trained in parallel across time, with an O(1) gradient path length between any two tokens. The preprint reports that SMT outperforms BPTT on language modeling and pixel sequence modeling tasks across multiple RNN architectures. By decoupling what the memory should contain from how it is updated, the approach could help RNNs scale more effectively.
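
The core reduction, training on one-step memory transitions instead of unrolling through time, can be illustrated with a toy sketch. This is not the preprint's implementation: the scalar linear cell, the helper names (make_memory_labels, one_step_pairs, train_on_transitions), and the hyperparameters are all assumptions chosen for illustration, and the memory labels come from a fixed linear recurrence standing in for the Transformer-based encoder and its predictive state objective.

```python
import random

def make_memory_labels(xs, a_true=0.5, b_true=2.0):
    """Hypothetical stand-in for the label-generating encoder.

    The preprint uses a Transformer-based encoder; a fixed linear
    recurrence fakes the labels here so the example stays tiny.
    """
    labels, m = [0.0], 0.0
    for x in xs:
        m = a_true * m + b_true * x
        labels.append(m)
    return labels  # labels[t] is the target memory after reading xs[:t]

def cell(params, m_prev, x):
    """Toy scalar RNN cell: next memory from previous memory and input."""
    a, b = params
    return a * m_prev + b * x

def one_step_pairs(xs, labels):
    """Turn a sequence into independent (m_prev, x, m_next) examples."""
    return [(labels[t], xs[t], labels[t + 1]) for t in range(len(xs))]

def train_on_transitions(pairs, lr=0.1, epochs=500):
    """Gradient descent on the mean squared one-step error.

    Each pair is handled independently of the others, so no gradient
    flows through time and the inner loop could run in parallel.
    """
    a, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for m_prev, x, m_next in pairs:
            err = cell((a, b), m_prev, x) - m_next
            grad_a += 2 * err * m_prev / n
            grad_b += 2 * err * x / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(64)]
labels = make_memory_labels(xs)
params = train_on_transitions(one_step_pairs(xs, labels))

# At inference time the trained cell is rolled out recurrently as usual.
m = 0.0
for x in xs:
    m = cell(params, m, x)
rollout_error = abs(m - labels[-1])
```

In this sketch the cell only ever sees label memories as inputs during training, which is what keeps the transitions independent; recurrence appears only in the final rollout. How well a rollout tracks the labels for a realistic, non-linear cell is a question for the preprint itself, not something this toy example establishes.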

Training Infrastructure · backpropagation through time · Supervised Memory Training · Pretraining Recurrent Networks without Recurrence