technique
SETA
technique · active · provisional
seta-dae69119 · 1 event · first seen 9d ago
Aliases: SETA
Recent events (1)
SETA: Sparse Subspace-to-Expert Sharing for Continual Learning in LLMs
Researchers introduce SETA (Mixture of Sparse Experts for Task Agnostic Continual Learning), a framework that addresses catastrophic forgetting in LLMs by using adaptive sparse subspace decomposition to separate task-specific expert modules from shared ones. Adaptive elastic anchoring and routing-aware regularization protect shared knowledge at both the weight level and the routing level. In experiments on LLaMA-2 7B and Qwen3-4B, SETA performs competitively with or better than continual learning baselines and retains early-task knowledge well.
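The summary names two protection mechanisms without giving formulas. As a rough illustration only, the sketch below shows one generic way such penalties are often written: a quadratic "anchor" term that pulls shared weights back toward stored values, and a KL term that penalizes drift in a router's distribution over experts. Every function name, parameter, and formula here is an assumption made for illustration, not SETA's actual formulation.

```python
import math


def elastic_anchor_penalty(weights, anchors, importance, strength=1.0):
    """Hypothetical weight-level penalty: importance-weighted squared
    distance between current shared weights and their anchored values.
    (Generic quadratic anchoring, assumed for illustration.)"""
    if not (len(weights) == len(anchors) == len(importance)):
        raise ValueError("weights, anchors and importance must have equal length")
    return strength * sum(
        imp * (w - a) ** 2 for w, a, imp in zip(weights, anchors, importance)
    )


def softmax(logits):
    """Numerically stable softmax over a list of router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def routing_drift_penalty(old_logits, new_logits):
    """Hypothetical routing-level penalty: KL(old || new) between the
    router's expert distributions before and after training on a new task."""
    p = softmax(old_logits)
    q = softmax(new_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Combined regularizer for one step, added to the task loss (illustrative values).
reg = elastic_anchor_penalty([1.0, 3.0], [1.0, 2.0], [0.5, 2.0]) \
    + routing_drift_penalty([0.0, 1.0], [0.2, 0.9])
```

In this sketch the anchor term is zero when shared weights have not moved, and the routing term is zero when the router assigns the same probabilities to experts as before, so both only penalize change relative to earlier tasks. Extremely large logit gaps could underflow a probability to zero in `routing_drift_penalty`; a practical version would clamp the probabilities.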