
Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories

paper · active · provisional · language-models-need-sleep-learning-to-self-modify-and-consolidate-memories-f19c9e75 · 1 event · first seen 14d ago

Recent events (1)

5 · arXiv · cs.LG · 14d ago

Sleep paradigm for LLMs enables continual learning and memory consolidation via distillation and RL

A new arXiv preprint proposes a "Sleep" paradigm for language models that enables continual learning by consolidating short-term, in-context memories into long-term parameters. The framework has two stages. In Knowledge Seeding, a smaller model's memories are distilled into a larger network through on-policy distillation combined with RL-based imitation learning. In Dreaming, the model improves itself on RL-generated synthetic curricula, with no human supervision. Experiments cover long-horizon tasks, continual learning, knowledge incorporation, and few-shot generalization. The work targets a known weakness of current LLMs: they struggle to retain temporal knowledge across contexts.