Almanac
technique

Key-Value Cache

techniqueactiveprovisionalkey-value-cache-1aa0e8d4·1 events·first seen 22d ago

Aliases: Key-Value Cache

Co-occurring entities

More like this (12)

Recent events (1)

6arXiv · cs.CL·22d ago·source ↗

Language Models Need Sleep: Periodic Context Consolidation via Fast Weights and SSM Blocks

This paper proposes a sleep-like consolidation mechanism for transformer-based LLMs to address the quadratic scaling of attention with context length. During 'sleep' phases, the model performs N offline recurrent passes over accumulated context, updating fast weights in state-space model (SSM) blocks via a learned local rule, then clears the KV cache. The approach is evaluated on synthetic tasks (cellular automata, multi-hop graph retrieval) and math reasoning, where standard transformers and SSM-attention hybrids fail, with performance scaling with sleep duration N.