technique
representation-level steering
technique · active · provisional
representation-level-steering-ecb97e81 · 1 event · first seen 18d ago
Aliases: representation-level steering
More like this (12)
representation-level sensitive information leakage
Activation Steering
SafeSteer
State-Conditioned Dynamic Steering
Agentic Chain-of-Thought Steering
Visual Verification Enables Inference-time Steering and Autonomous Policy Improvement
Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning
steering vectors
Gumbel noise steering
source-level self-rewriting
context governance
deliberative alignment
Recent events (1)
BeliefTrack: Benchmarking and Improving Contextual Belief Management in LLMs
This paper introduces Contextual Belief Management (CBM) as a framework for studying how LLMs should update, preserve, or ignore information across long-horizon interactions. The authors release BeliefTrack, a closed-world benchmark with symbolic verifiers that enable exact turn-level evaluation across Rule Discovery and Circuit Diagnosis tasks. Vanilla LLMs show severe CBM failures; reinforcement learning with belief-state rewards reduces failure rates by 70.9% on average, while representation-level steering achieves a 46.1% reduction. Probing experiments reveal latent belief-state dynamics underlying these failures.