paper
Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning
paper · active · provisional
agentic-chain-of-thought-steering-for-efficient-and-controllable-llm-reasoning-8f8c0b82 · 1 event · first seen 13d ago
Aliases: Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning
Co-occurring entities
More like this (12)
Agentic Chain-of-Thought Steering
Predicting Future Behaviors in Reasoning Models Enables Better Steering
Beyond the Commitment Boundary: Probing Epiphenomenal Chain-of-Thought in Large Reasoning Models
When the Chain of Thought Knows Better: Failure Modes in Multi-Turn Reasoning Models
Chain-of-Thought Reasoning
Visual Verification Enables Inference-time Steering and Autonomous Policy Improvement
AI-driven constraint reasoning
Reasoning as Pattern Matching: Shared Mechanisms in Human and LLM Everyday Reasoning
From Correctness to Utility: Gain-Based Prefix Evaluation for LLM Reasoning
Does Reasoning Preserve Alignment? On the Trustworthiness of Large Reasoning Models
Operads for compositional reasoning in LLMs
Reasoning Language Models
Recent events (1)
ACTS: Agentic Chain-of-Thought Steering for efficient and controllable LLM reasoning
Researchers introduce Agentic Chain-of-Thought Steering (ACTS), a framework that formulates inference-time reasoning control as a Markov decision process, where a controller agent adaptively steers a frozen reasoner by issuing reasoning strategy directives and steering phrases at each step. The controller is initialized from synthetic steering trajectories with multi-budget augmentation and further optimized via reinforcement learning with budget-conditioned reward shaping. ACTS matches full-thinking performance with significant token savings and enables controllable accuracy-efficiency trade-offs across multiple benchmarks and reasoner models.
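The control loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the action set, the `State` fields, the toy controller and reasoner, and the form of the reward are all hypothetical stand-ins, chosen only to show how a controller might steer a frozen reasoner step by step under a token budget.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical action space: (strategy directive, steering phrase).
ACTIONS: List[Tuple[str, str]] = [
    ("continue", ""),
    ("verify", "Wait, let me double-check that."),
    ("conclude", "So the final answer is"),
]


@dataclass
class State:
    """MDP state as assumed here: question, trace so far, token usage, budget."""
    question: str
    budget: int
    trace: List[str] = field(default_factory=list)
    tokens_used: int = 0


def toy_reasoner(state: State, phrase: str) -> str:
    """Stand-in for a frozen reasoner: emits the next chunk after the phrase."""
    prefix = phrase + " " if phrase else ""
    return prefix + f"step {len(state.trace) + 1}"


def toy_controller(state: State) -> int:
    """Stand-in policy: conclude when little budget remains, else continue."""
    remaining = state.budget - state.tokens_used
    return 2 if remaining <= 8 else 0


def shaped_reward(correct: bool, tokens_used: int, budget: int,
                  penalty: float = 0.5) -> float:
    """Assumed budget-conditioned reward: correctness minus an overrun penalty."""
    overrun = max(0, tokens_used - budget) / max(1, budget)
    return float(correct) - penalty * overrun


def rollout(question: str, budget: int,
            controller: Callable[[State], int] = toy_controller,
            reasoner: Callable[[State, str], str] = toy_reasoner,
            max_steps: int = 50) -> State:
    """Run one steering episode: the controller acts, the reasoner responds."""
    state = State(question=question, budget=budget)
    for _ in range(max_steps):
        directive, phrase = ACTIONS[controller(state)]
        chunk = reasoner(state, phrase)
        state.trace.append(chunk)
        # Whitespace word count is a crude proxy for token count.
        state.tokens_used += len(chunk.split())
        if directive == "conclude":
            break
    return state
```

In this sketch the reasoner is never updated; only the controller's policy would be trained, for example by reinforcement learning against a reward like `shaped_reward`, with the budget varied across episodes to expose different accuracy-efficiency trade-offs.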