
Self-Generated Replay

technique · active · provisional · self-generated-replay-d8ee6d45 · 1 event · first seen 22d ago

Aliases: Self-Generated Replay


Recent events (1)

6 · arXiv · cs.LG · 22d ago

Self-Generated Replay Nearly Eliminates Catastrophic Forgetting in Language Models

This paper investigates catastrophic forgetting in language models during continual learning. It finds that a model can use self-generated samples from its own training distribution as effective replay data, nearly eliminating forgetting without requiring stored exemplars. The authors identify two conditions under which forgetting remains a problem: when a model is pretrained close to capacity saturation, leaving no room for new knowledge, and when a low learning rate is used to limit forgetting, which comes at the cost of far more training steps. Self-generated replay breaks this tradeoff between learning rate and forgetting, enabling fast finetuning at a high learning rate without degradation on prior tasks.
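The replay idea summarized above can be illustrated with a minimal sketch of the data-mixing step. This is not the paper's procedure: the summary does not state how samples are drawn or what mixing ratio is used, so the function name `mix_with_self_replay`, the `replay_fraction` parameter, and the `toy_generate` stand-in are all hypothetical. The sketch assumes replay text is sampled from the model before finetuning begins and then interleaved with the new-task data.

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (source tag, text)


def mix_with_self_replay(
    new_texts: List[str],
    generate: Callable[[], str],
    replay_fraction: float = 0.5,
    seed: int = 0,
) -> List[Example]:
    """Mix new-task texts with self-generated replay samples.

    `generate` is assumed to sample one text from the model as it was
    before finetuning. `replay_fraction` is the assumed share of replay
    examples in the final mix (an illustrative choice, not from the paper).
    """
    if not 0.0 <= replay_fraction < 1.0:
        raise ValueError("replay_fraction must be in [0, 1)")
    # Pick n_replay so that n_replay / (n_replay + n_new) ~= replay_fraction.
    n_replay = round(len(new_texts) * replay_fraction / (1.0 - replay_fraction))
    # No stored exemplars: every replay example comes from the model itself.
    replay = [("replay", generate()) for _ in range(n_replay)]
    mixed = [("new", text) for text in new_texts] + replay
    # Shuffle so each finetuning batch sees both new and replay examples.
    random.Random(seed).shuffle(mixed)
    return mixed


def toy_generate() -> str:
    # Hypothetical stand-in for sampling from the pre-finetuning model.
    return "text sampled from the original model"


if __name__ == "__main__":
    data = mix_with_self_replay(["new fact A", "new fact B"], toy_generate)
    for source, text in data:
        print(source, "|", text)
```

In a real setup, `generate` would wrap sampling from the frozen pre-finetuning checkpoint, and the mixed list would feed an ordinary finetuning loop run at whatever learning rate the new task calls for.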