model

Qwen3-1.7B-Base

model · active · provisional · qwen3-1-7b-base-93013437 · 1 event · first seen 2d ago

Aliases: Qwen3-1.7B-Base

More like this (12)

Qwen3.5-2B-Base · Qwen3-1.7B · Qwen3.5-0.8B · Qwen-0.5B · Qwen2.5-1.5B · Qwen3.5-35B-A3B-Base · Qwen3-4B-Base · Qwen3-8B-Base · Qwen3-30B-A3B-Base · Qwen3-14B-Base · Qwen 3.5 27B · Qwen3.6-27B

Recent events (1)

5 · arXiv · cs.AI · 2d ago

MAST: Mechanism-guided selective unlearning for RLVR-trained reasoning models

Researchers introduce MAST (Mechanism-Aligned Selective Targeting), a method for selectively unlearning capabilities that reinforcement learning from verifiable rewards (RLVR) induces in language models, while minimizing collateral damage to retained knowledge. Rather than applying gradient ascent to all parameters, the approach ranks attention-projection tensors by off-principal energy and gradient coupling, then updates only the top-ranked subset. Evaluated on Qwen2.5-Math-1.5B and Qwen3-1.7B-Base, MAST achieves statistically significant forgetting on the targeted MATH problems while preserving GSM8K performance, whereas full-parameter unlearning collapses retained capabilities. The results hold across random seeds and across unlearning objectives (NPO and SimNPO).

AI Safety Research · Alignment and RLHF · Qwen3-1.7B-Base · MATH · MAST
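
The tensor-selection step described in the MAST summary above can be illustrated with a rough sketch. Everything concrete here is an assumption made for illustration, not the paper's definition: "off-principal energy" is taken to be the share of squared singular-value mass outside the top-k directions of an RLVR weight update, "gradient coupling" is taken to be the absolute cosine similarity between that update and the forget-set gradient, the two are combined by a simple product, and the tensor names (`q_proj`, `k_proj`, `v_proj`, `o_proj`) and helper functions are hypothetical.

```python
import numpy as np

# Assumed naming for attention-projection tensors (hypothetical, for illustration).
ATTN_TAGS = ("q_proj", "k_proj", "v_proj", "o_proj")


def off_principal_energy(delta, k=1):
    """Assumed definition: fraction of squared singular-value mass
    lying outside the top-k singular directions of the update `delta`."""
    s = np.linalg.svd(delta, compute_uv=False)  # singular values, descending
    total = float(np.sum(s ** 2))
    if total == 0.0:
        return 0.0
    return float(np.sum(s[k:] ** 2) / total)


def gradient_coupling(delta, grad):
    """Assumed definition: absolute cosine similarity between the
    weight update and the forget-set gradient for the same tensor."""
    denom = float(np.linalg.norm(delta) * np.linalg.norm(grad))
    if denom == 0.0:
        return 0.0
    return float(abs(np.sum(delta * grad)) / denom)


def select_tensors(deltas, grads, top_n=2, k=1):
    """Score attention-projection tensors and return the top_n names.

    deltas: dict name -> (RLVR-trained weights minus base weights)
    grads:  dict name -> gradient of the unlearning loss on the forget set
    The product of the two scores is an assumed way of combining them.
    """
    scores = {}
    for name, delta in deltas.items():
        if not any(tag in name for tag in ATTN_TAGS):
            continue  # non-attention tensors are never candidates
        scores[name] = off_principal_energy(delta, k) * gradient_coupling(delta, grads[name])
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n], scores


# Toy inputs: a full-rank update, a rank-1 update, and an MLP tensor.
toy_deltas = {
    "layers.0.self_attn.q_proj": np.eye(2),
    "layers.0.self_attn.k_proj": np.outer([1.0, 2.0], [3.0, 4.0]),
    "layers.0.mlp.up_proj": np.eye(2),
}
toy_grads = {name: d.copy() for name, d in toy_deltas.items()}
selected, toy_scores = select_tensors(toy_deltas, toy_grads, top_n=1)
```

In this toy setup only the selected tensors would receive unlearning updates, with every other parameter frozen; how the actual method sets the subset size and combines the two criteria is not stated in the summary.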