6 · arXiv cs.LG (Machine Learning) · 25d ago

Self-Generated Replay Nearly Eliminates Catastrophic Forgetting in Language Models

This paper investigates catastrophic forgetting in language models during continual learning, finding that models can use self-generated samples from their own training distribution as effective replay data, nearly eliminating forgetting without requiring stored exemplars. The authors identify two key conditions where forgetting persists: when models are pretrained near capacity saturation (leaving no room for new knowledge), and when low learning rates are used to reduce forgetting at the cost of requiring far more training steps. Self-generated replay breaks this learning-rate/forgetting tradeoff, enabling fast high-learning-rate finetuning without degradation on prior tasks.
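The replay idea described above can be illustrated with a minimal sketch. The summary does not give the paper's sampling procedure or mixing ratio, so everything named here is an assumption for illustration: `build_replay_mix`, the `generate_fn` callable (a stand-in for sampling text from the model before finetuning), and the `replay_ratio` parameter are all hypothetical.

```python
import random
from typing import Callable, List


def build_replay_mix(
    new_examples: List[str],
    generate_fn: Callable[[], str],
    replay_ratio: float = 0.5,
    seed: int = 0,
) -> List[str]:
    """Mix new-task examples with samples drawn from the model itself.

    replay_ratio is the (assumed) fraction of the final training set that
    should consist of self-generated samples. No stored exemplars from the
    original training data are needed, only the generator.
    """
    if not 0.0 <= replay_ratio < 1.0:
        raise ValueError("replay_ratio must be in [0, 1)")

    # Solve n_replay / (n_new + n_replay) = replay_ratio for n_replay.
    n_new = len(new_examples)
    n_replay = round(n_new * replay_ratio / (1.0 - replay_ratio))

    # Hypothetical step: in a real setup generate_fn would sample
    # unconditioned text from the checkpoint taken before finetuning.
    replay = [generate_fn() for _ in range(n_replay)]

    # Shuffle so each finetuning batch sees both new and replayed data.
    mixed = list(new_examples) + replay
    random.Random(seed).shuffle(mixed)
    return mixed


if __name__ == "__main__":
    new_task = ["new example 1", "new example 2", "new example 3"]
    stub_generator = lambda: "text sampled from the pre-finetuning model"
    for item in build_replay_mix(new_task, stub_generator, replay_ratio=0.5):
        print(item)
```

The resulting list would be fed to an ordinary finetuning loop; the learning rate and the ratio that work in practice are reported in the paper, not here.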

Enterprise Deployment Patterns · Agent and Tool Ecosystem · Alignment and RLHF · catastrophic forgetting · Language Model Finetuning · Continual Learning · Self-Generated Replay

Related guides (3)

Enterprise Deployment Patterns · Topic guide

Enterprise Deployment Patterns: From AI Demo to Production Reality

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

Alignment and RLHF · Topic guide

Alignment and RLHF: Teaching AI Models to Behave

Related events (8)

5 · The Batch · 1mo ago

Sony and University Researchers Train Robots To Learn Without Catastrophic Forgetting

Researchers from UT Austin, UCLA, Nanyang Technological University, and Sony developed a sequential fine-tuning recipe combining LoRA and on-policy reinforcement learning (GRPO) to reduce catastrophic forgetting in vision-language-action (VLA) models for robotics. Applied to the OpenVLA-OFT model on the LIBERO benchmark, the method achieved 81.2% success on libero-spatial tasks with near-zero forgetting (0.3 percentage point drop), outperforming established continual learning baselines including Dark Experience Replay and Elastic Weight Consolidation. The approach requires no replay of prior task data and also showed modest generalization to unseen tasks. The authors note the method has not yet been tested outside robotics simulation contexts.

Evaluation and Benchmarking · Agent and Tool Ecosystem · Elastic Weight Consolidation · Dark Experience Replay · University of California Los Angeles

5 · arXiv · cs.AI · 2d ago

MAST: Mechanism-guided selective unlearning for RLVR-trained reasoning models

Researchers introduce MAST (Mechanism-Aligned Selective Targeting), a method for selectively unlearning capabilities induced by reinforcement learning from verifiable rewards (RLVR) in language models while minimizing collateral damage to retained knowledge. The approach ranks attention-projection tensors by off-principal energy and gradient coupling to identify a targeted subset for update, rather than applying full-parameter gradient ascent. Evaluated on Qwen2.5-Math-1.5B and Qwen3-1.7B-Base, MAST achieves statistically significant forgetting on target MATH problems while preserving GSM8K performance, whereas full-parameter unlearning collapses retained capabilities. The method generalizes across seeds and unlearning objectives (NPO/SimNPO).

AI Safety Research · Alignment and RLHF · Qwen3-1.7B-Base · MATH · MAST

5 · arXiv · cs.LG · 17d ago

Sleep paradigm for LLMs enables continual learning and memory consolidation via distillation and RL

A new arXiv preprint proposes a 'Sleep' paradigm for language models that enables continual learning by consolidating short-term in-context memories into long-term parameters. The framework has two stages: Knowledge Seeding (distilling a smaller model's memories into a larger network via on-policy distillation combined with RL-based imitation learning) and Dreaming (self-improvement via RL-generated synthetic curricula without human supervision). Experiments cover long-horizon tasks, continual learning, knowledge incorporation, and few-shot generalization, addressing a known weakness of current LLMs in retaining temporal knowledge across contexts.

Alignment and RLHF · Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories · Knowledge Seeding · Generalized Distillation

5 · arXiv · cs.LG · 8d ago

Stable Recovery Manifold hypothesis: catastrophic forgetting as accessibility problem, not information destruction

A new arXiv preprint investigates the geometric structure of recoverability in continual learning using Split CIFAR-100 and a sequentially trained ResNet-18. The authors introduce Recovery Subspace Dimensionality (k_t) and find that recovery dimensionality remains stable across tasks (mean k_t = 8.0) despite substantial representational drift, with principal-angle drift strongly predicting recoverability (r = -0.862). The findings support the Stable Recovery Manifold hypothesis: forgotten knowledge remains compactly decodable, reframing catastrophic forgetting as a manifold-alignment and accessibility problem rather than true information loss.
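Principal angles, which the summary cites as the drift measure, have a standard computation that can be sketched as follows. This is a generic NumPy illustration, not the preprint's code: the function name `principal_angles` is hypothetical, and how the authors build the subspaces or aggregate the angles into a single drift score is not stated in the summary.

```python
import numpy as np


def principal_angles(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Principal angles (radians) between the column spaces of A and B.

    A and B are (n_features, k) matrices whose columns span each subspace,
    for example the top-k directions of a layer's representations before
    and after training on a later task (an assumed use, for illustration).
    """
    # Orthonormal bases for each column space.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)

    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)

    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return np.arccos(np.clip(cosines, -1.0, 1.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    before = rng.normal(size=(64, 8))
    after = before + 0.1 * rng.normal(size=(64, 8))  # small simulated drift
    print(principal_angles(before, after))
```

Angles near zero mean the two subspaces are almost aligned; angles near pi/2 mean they share little.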

Evaluation and Benchmarking · Split CIFAR-100 · Recovery Subspace Dimensionality · The Stable Recovery Manifold: Geometric Principles Governing Recoverability in Continual Learning

5 · arXiv · cs.LG · 22d ago

Language Generation in the Limit with Bounded Memory: Characterization via Sperner's Theorem

This paper studies language generation in the limit under bounded memory constraints, extending classical learning theory to the generation setting. The authors characterize when memoryless generation is possible, derive minimax density bounds using Sperner's theorem and symmetric chain decompositions, and show that adaptively chosen memory outperforms sliding-window memory. They also revisit incremental identification in the limit, finding that exact identification fails for collections of three or more languages but an approximate relaxation is achievable for all finite collections.
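For reference, Sperner's theorem, which the summary names as the tool behind the density bounds, can be stated as below. This is the classical statement only; how the paper applies it to memoryless generation is not given in the summary.

```latex
\text{If } \mathcal{F} \subseteq 2^{[n]} \text{ is an antichain, i.e. no } A, B \in \mathcal{F} \text{ satisfy } A \subsetneq B, \text{ then }
|\mathcal{F}| \le \binom{n}{\lfloor n/2 \rfloor}.
```

The bound is attained by taking all subsets of size $\lfloor n/2 \rfloor$.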

Evaluation and Benchmarking · AI Safety Research · Sperner's Theorem · Language Generation in the Limit · Identification in the Limit

5 · arXiv · cs.CL · 10d ago

Provenance-grounded gating and adaptive recovery improve synthetic post-training data curation

A controlled study examines two underexplored practices in synthetic post-training data pipelines: grounding filtering signals in source provenance and systematically recovering rejected samples rather than discarding them. Using adversarially injected corpora as ground-truth failure labels, the authors find that exact source provenance improves faithfulness gating for stronger judges, that hallucination and reward gates reject largely disjoint populations (making both necessary), and that adaptive recovery via failure diagnosis and targeted regeneration outperforms naive resampling. Generator scale is the primary driver of downstream fine-tuning quality, with filtration and recovery contributing meaningfully but secondarily.

Evaluation and Benchmarking · Alignment and RLHF · Provenance-Grounded Gating and Adaptive Recovery in Synthetic Post-Training Data Curation

10 · OpenAI Blog · 1mo ago

Language models are few-shot learners

OpenAI published the GPT-3 paper introducing a 175-billion-parameter autoregressive language model demonstrating strong few-shot learning capabilities across a wide range of NLP tasks. The work showed that scaling language models dramatically improves task-agnostic, few-shot performance, often matching or exceeding fine-tuned models without any gradient updates. This paper became a foundational milestone in the development of large language models and the modern AI landscape.

Long Context Evolution · Frontier Model Releases · GPT-3 · Language Models are Few-Shot Learners · few-shot learning

3 · arXiv · cs.CL · 5d ago

Continual learning approach for disfluency-aware ASR with explicit disfluency tokens

A new arXiv preprint addresses the challenge of transcribing disfluent speech (hesitations, repetitions, fillers) in ASR systems, which typically omit such markers, causing information loss. The authors introduce explicit disfluency tokens into a pretrained ASR model and apply continual learning to adapt across datasets with varying disfluency distributions while mitigating catastrophic forgetting. The work identifies a trade-off between learning disfluency markers and general ASR performance, and finds a consistent cross-attention head mechanism shared across continual learning methods.

Learning to Hear Hesitation: Continual Learning for Disfluency-Aware ASR