Entity · technique

Alternating Token-Weighted Unlearning

technique · active · alternating-token-weighted-unlearning-a17257f8 · 1 event · first seen Jun 5, 2026

Aliases: Alternating Token-Weighted Unlearning

Co-occurring entities

More like this (12)

Backdoor Unlearning Generalization: A Path Toward the Removal of Unknown Triggers in LLMs
inference-time behavioural unlearning
TILDE: TILt-based Distributional Erasure for Concept Unlearning
Hierarchical Advantage Weighting for Online RL Fine-Tuning of VLAs from Sparse Episode Outcomes
Uncertainty-based Debiasing and Unlearning for Decontamination
Selective Ground Truth Token Training
Localized Adaptation Reveals Distinct Learning Signatures in Transformers
Natural Ungrokking: Asymmetric Control of Which Rules Survive Pretraining
Entropy-Regularized Reinforcement Learning
Operator Learning
FORCE: Efficient VLA Reinforcement Fine-Tuning via Value-Calibrated Warm-up and Self-Distillation
Active Offline-to-Online Reinforcement Learning

Recent events (1)

5 · arXiv · cs.CL · Jun 5, 2026

ATWU: Token-level importance learning improves LLM unlearning via retain-conflict criterion

This paper introduces Alternating Token-Weighted Unlearning (ATWU), a framework that learns which tokens in a forget sample are most relevant to unlearning by characterizing their conflict with the retain objective. Rather than relying on auxiliary models or heuristics, ATWU jointly learns token forget-specificity and model parameters using a lightweight linear scorer over hidden states. Evaluated on TOFU and RWKU benchmarks, ATWU achieves state-of-the-art forget-retain trade-offs and produces token-level scores that align with ground-truth forget-specific spans.
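The token-weighting idea in the summary above can be illustrated with a minimal sketch. It assumes a sigmoid-squashed linear scorer and a score-weighted average of per-token losses. The function names, toy hidden states, and normalization are hypothetical choices made for illustration, not the paper's actual formulation. The retain-conflict criterion used to train the scorer, and the sign or direction of the forget objective, are not specified in the summary and are left out.

```python
import math

def token_scores(hidden_states, w, b):
    """Hypothetical linear scorer: one sigmoid-squashed score per token."""
    scores = []
    for h in hidden_states:
        z = sum(wi * hi for wi, hi in zip(w, h)) + b
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores

def weighted_forget_loss(token_nll, scores):
    """Score-weighted average of per-token negative log-likelihoods."""
    total = sum(scores)
    return sum(s * l for s, l in zip(scores, token_nll)) / total

# Toy forget sample: three tokens with 2-dimensional hidden states (made-up values).
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
nll = [0.5, 2.0, 1.0]

# With a zero scorer every token gets the same score, so the weighted loss
# reduces to the plain mean of the per-token losses.
uniform_scores = token_scores(hidden, [0.0, 0.0], 0.0)
uniform_loss = weighted_forget_loss(nll, uniform_scores)

# A non-zero scorer shifts weight toward some tokens and away from others.
# In an alternating scheme one would update (w, b) with the model frozen,
# then update the model with (w, b) frozen; both updates are omitted here.
skewed_scores = token_scores(hidden, [4.0, -4.0], 0.0)
skewed_loss = weighted_forget_loss(nll, skewed_scores)
```

Here the skewed scorer puts most of its weight on the first token, so the weighted loss moves toward that token's loss. A real implementation would take the hidden states and per-token losses from the language model rather than from fixed lists.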

Evaluation and Benchmarking · AI Safety Research · RWKU · Alternating Token-Weighted Unlearning · TOFU