Entity · other

Thinking-Acting Gap

other · active · thinking-acting-gap-dedd7879 · 1 event · first seen May 28, 2026

Aliases: Thinking-Acting Gap

More like this (12)

Abstraction Gap
Bridging Talk and Thought: Understanding Dialogue Dynamics Across Collaborative Problem-Solving Contexts
Vision-Language-Action models
adaptive thinking
extended thinking
Beyond the Commitment Boundary: Probing Epiphenomenal Chain-of-Thought in Large Reasoning Models
Research Gap Inference
regional-to-global perception gap
Program-of-Thought
BadWAM: When World-Action Models Dream Right but Act Wrong
capability-reliability gap
Measuring the Gap Between Human and LLM Research Ideas

Recent events (1)

7 · arXiv · cs.CL · May 28, 2026

AXPO: Agent Explorative Policy Optimization Addresses Thinking-Acting Gap in Multimodal Agentic Reasoning

This paper identifies a structural asymmetry in agentic reasoning, the 'Thinking-Acting Gap': under standard RL training (GRPO), tool use is attempted in only ~30% of rollouts, and tool-using subgroups in which every rollout is wrong suppress the learning signal. The authors propose AXPO (Agent eXplorative Policy Optimization), which holds the thinking prefix fixed and resamples tool calls for all-wrong subgroups, combined with uncertainty-based prefix selection. Evaluated on nine multimodal benchmarks with Qwen3-VL-Thinking at multiple scales, SFT+AXPO outperforms SFT+GRPO by 1.8 percentage points on both Pass@1 and Pass@4 at 8B, and the 8B model surpasses the 32B baseline on Pass@4 with a quarter of the parameters.
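
To illustrate why an all-wrong subgroup carries no learning signal, here is a minimal sketch rather than the paper's implementation. It assumes binary rewards and the usual GRPO-style group normalization, (reward - group mean) / (group std + eps). The helper `resample_if_all_wrong` and its `sample_tool_call` callback are hypothetical names that only mimic the idea of keeping the thinking prefix and redrawing tool calls; the paper's actual procedure, including uncertainty-based prefix selection, is not reproduced here.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence


def group_advantages(rewards: Sequence[float], eps: float = 1e-6) -> List[float]:
    """GRPO-style group-normalized advantages: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def resample_if_all_wrong(
    prefix: str,
    rewards: Sequence[float],
    sample_tool_call: Callable[[str], float],  # hypothetical: returns a reward
    max_retries: int = 4,
) -> List[float]:
    """Hypothetical helper: if every rollout in the subgroup failed, keep the
    same thinking prefix and draw fresh tool-call continuations until one
    succeeds or the retry budget runs out. Returns the extended reward list."""
    rewards = list(rewards)
    if any(r > 0 for r in rewards):
        return rewards  # the group already contains a contrastive signal
    for _ in range(max_retries):
        rewards.append(sample_tool_call(prefix))
        if rewards[-1] > 0:
            break
    return rewards


if __name__ == "__main__":
    all_wrong = [0.0, 0.0, 0.0, 0.0]
    # Identical rewards give zero advantage everywhere, so no gradient signal.
    print(group_advantages(all_wrong))

    # Stub sampler (assumption): the second resampled tool call succeeds.
    outcomes = iter([0.0, 1.0])
    extended = resample_if_all_wrong("<think>...</think>", all_wrong,
                                     lambda prefix: next(outcomes))
    # With one success in the group, advantages are no longer all zero.
    print(group_advantages(extended))
```

In this sketch the first call yields zero for every rollout, while the resampled group assigns a positive advantage to the successful tool call and negative advantages to the rest.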

Frontier Model Releases · Agent and Tool Ecosystem · AXPO · GRPO · Thinking-Acting Gap · +4 more