ORBIT
ORBIT: Training-free multi-attribute behavioral steering via orthogonal subspace rotation
Researchers introduce ORBIT (Orthogonal Rotation-Based Intervention Technique), a training-free activation steering method that controls multiple behavioral attributes in a language model simultaneously. The approach constructs a joint subspace from per-attribute steering planes via SVD and applies a single norm-preserving rotation, which avoids the norm imbalance and directional cancellation that arise when steering vectors are naively summed. The authors also release TraitFactory, a new multi-attribute behavioral benchmark, and evaluate on Llama-3.2-3B, Qwen-2.5-7B, and Llama-3.1-8B. They report that ORBIT outperforms existing training-free baselines on multi-attribute steering while better preserving output coherence.
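The summary does not spell out the construction, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes each steering plane is given as a 2 x d array of spanning vectors, that the joint subspace is the span of all stacked planes, and that the rotation moves the in-subspace part of a hidden state toward a combined target direction. The names `joint_basis`, `rotate_in_subspace`, `target`, and `theta` are hypothetical.

```python
import numpy as np


def joint_basis(planes, tol=1e-8):
    """Orthonormal basis (d, k) for the span of all per-attribute planes.

    Assumption: each plane is a (2, d) array of spanning vectors.
    """
    stacked = np.concatenate(planes, axis=0)            # (2 * n_attrs, d)
    _, s, vt = np.linalg.svd(stacked, full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))                  # drop near-null directions
    return vt[:rank].T                                  # columns are orthonormal


def rotate_in_subspace(h, basis, target, theta):
    """Rotate the in-subspace part of h toward target by at most theta radians.

    The component of h orthogonal to the subspace is left untouched, and the
    in-subspace component keeps its length, so the norm of h is preserved.
    """
    coords = basis.T @ h                                # coordinates inside the subspace
    residual = h - basis @ coords                       # part outside the subspace
    t = basis.T @ target
    r = np.linalg.norm(coords)
    t_norm = np.linalg.norm(t)
    if r == 0.0 or t_norm == 0.0:
        return h.copy()
    u = coords / r                                      # current in-subspace direction
    t_perp = t - (t @ u) * u                            # target part orthogonal to u
    n = np.linalg.norm(t_perp)
    if n < 1e-12:
        return h.copy()                                 # already (anti-)aligned
    v = t_perp / n
    # Do not rotate past the target direction.
    angle_to_target = np.arccos(np.clip((t @ u) / t_norm, -1.0, 1.0))
    theta = min(theta, angle_to_target)
    new_coords = r * (np.cos(theta) * u + np.sin(theta) * v)
    return residual + basis @ new_coords


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 32
    planes = [rng.normal(size=(2, d)) for _ in range(3)]   # toy steering planes
    h = rng.normal(size=d)                                  # toy hidden state
    target = sum(p[1] for p in planes)                      # toy combined target

    steered = rotate_in_subspace(h, joint_basis(planes), target, theta=0.4)
    naive = h + target                                      # naive vector summation

    print("original norm:", np.linalg.norm(h))
    print("rotated norm: ", np.linalg.norm(steered))
    print("summed norm:  ", np.linalg.norm(naive))
```

In this toy setup the rotated state keeps the norm of the original by construction, while the naively summed state generally does not. That is the contrast the summary draws between rotation and vector summation.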