Entity · paper

Sparse Subspace-to-Expert Sharing for Task-Agnostic Continual Learning

paper · active · sparse-subspace-to-expert-sharing-for-task-agnostic-continual-learning-2726e327 · 1 event · first seen Jun 8, 2026

Aliases: Sparse Subspace-to-Expert Sharing for Task-Agnostic Continual Learning

Co-occurring entities

LLaMA-7B · Qwen3-4B · SETA

More like this (12)

Sparse Mixture-of-Experts
Multi-Task Learning
Unsupervised Continual Clustering via Forward-Backward Knowledge Distillation
TailLoR: Protecting Principal Components in Parameter-Efficient Continual Learning
Multi-Task Bayesian In-Context Learning
KSAA-2026 Shared Task
Towards Explainable Adjudicative Variance: Quantifying Judicial Discretion via Gated Multi-Task Learning
Unstable Features, Reproducible Subspaces: Understanding Seed Dependence in Sparse Autoencoders
CHERRY: Compressed Hierarchical Experts with Recurrent Representational Yield
Rank-Constrained Subspace Learning (RCSL)
Hierarchical Advantage Weighting for Online RL Fine-Tuning of VLAs from Sparse Episode Outcomes
Cost-Sensitive Conformal Prediction and Human-in-the-Loop Abstention for Imbalanced High-Stakes Decision Support: A Multi-Domain Benchmark

Recent events (1)

4 · arXiv · cs.LG · Jun 8, 2026

SETA: Sparse Subspace-to-Expert Sharing for Continual Learning in LLMs

Researchers introduce SETA (Mixture of Sparse Experts for Task Agnostic Continual Learning), a framework addressing catastrophic forgetting in LLMs via adaptive sparse subspace decomposition into task-specific and shared expert modules. The approach uses adaptive elastic anchoring and routing-aware regularization to protect shared knowledge at both weight and routing levels. Experiments on LLaMA-2 7B and Qwen3-4B show competitive or superior performance versus continual learning baselines, with strong retention of early-task knowledge.
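
The summary names two protection mechanisms but gives no formulas. As a rough illustration only, the sketch below shows one common way such terms are built in continual learning: a quadratic penalty that pulls weights back toward values saved after earlier tasks (weight level), and a KL divergence that penalizes drift in the router's distribution over experts (routing level). It is not SETA's actual loss. Every function name, parameter, and the choice of a quadratic and a KL form is an assumption made for this example.

```python
import math

def elastic_anchor_penalty(weights, anchors, importance, strength=1.0):
    """Hypothetical weight-level term: importance-weighted squared distance
    between current weights and anchor values saved after earlier tasks."""
    return strength * sum(
        imp * (w - a) ** 2 for w, a, imp in zip(weights, anchors, importance)
    )

def softmax(logits):
    """Numerically stable softmax over a list of router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def routing_drift_penalty(old_logits, new_logits):
    """Hypothetical routing-level term: KL(old || new) between the router's
    expert distributions before and after an update."""
    p = softmax(old_logits)
    q = softmax(new_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Only the third weight moved away from its anchor: 0.5 * (2.0 - 1.0)^2 = 0.5
print(elastic_anchor_penalty([0.5, -1.0, 2.0], [0.5, -1.0, 1.0], [1.0, 1.0, 0.5]))

# Unchanged routing gives zero penalty; swapped expert preference gives a positive one
print(routing_drift_penalty([1.0, 2.0], [1.0, 2.0]))
print(routing_drift_penalty([1.0, 2.0], [2.0, 1.0]))
```

In a training loop, terms like these would typically be added to the task loss with tunable coefficients. How SETA adapts the anchoring strength or which routing statistics it regularizes is not stated in the summary.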

Evaluation and Benchmarking · Open Weights Progress · LLaMA-7B · Qwen3-4B · Sparse Subspace-to-Expert Sharing for Task-Agnostic Continual Learning