regional-to-global perception gap

other · active · regional-to-global-perception-gap-38f8e39d · 1 event · first seen May 19, 2026

Aliases: regional-to-global perception gap

Co-occurring entities

Multimodal Large Language Models · Thinking-with-Images · on-policy self-distillation · Vision-OPD

More like this (12)

Perceptual Judgment Bias · capability-reliability gap · Abstraction Gap · Thinking-Acting Gap · sim-to-real gap · Global Illumination · misalignment generalization · representational inefficiency · Percepta · Expert Blindness Effect · global attention · Cue Visibility Gap

Recent events (1)

6 · arXiv · cs.CL · May 19, 2026

Vision-OPD: On-Policy Self-Distillation for Fine-Grained Visual Understanding in MLLMs

Vision-OPD addresses a 'regional-to-global perception gap' in multimodal LLMs (MLLMs): models answer fine-grained visual questions more accurately when given cropped evidence regions than when given full images. The method instantiates a crop-conditioned teacher and a full-image-conditioned student from the same MLLM, then minimizes token-level divergence along on-policy rollouts to transfer regional perception to the full-image policy. This self-distillation requires no external teacher models, ground-truth labels, reward verifiers, or inference-time tools. Reported benchmarks show performance competitive with or superior to larger open-source, closed-source, and agentic 'Thinking-with-Images' models.
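The token-level objective described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names are hypothetical, the logits are toy values standing in for the outputs of the same model under two conditionings, and the divergence direction (KL from the full-image student to the crop-conditioned teacher) is an assumption, since the summary only says "token-level divergence".

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_kl(p, q, eps=1e-12):
    """KL(p || q) between two distributions over the same vocabulary."""
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps))
        for pi, qi in zip(p, q)
    )


def self_distillation_loss(student_logits, teacher_logits):
    """Mean per-token KL(student || teacher) along one rollout.

    student_logits: per-token logits conditioned on the full image.
    teacher_logits: per-token logits for the SAME sampled tokens,
                    conditioned on the cropped evidence region.
    The KL direction is an assumption made for this sketch.
    """
    if not student_logits or len(student_logits) != len(teacher_logits):
        raise ValueError("rollouts must be non-empty and of equal length")
    kls = [
        token_kl(softmax(s), softmax(t))
        for s, t in zip(student_logits, teacher_logits)
    ]
    return sum(kls) / len(kls)


# Toy rollout of two tokens over a three-word vocabulary (hypothetical values).
student = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]  # full-image view: unsure at token 1
teacher = [[2.0, 0.0, 0.0], [1.0, 0.0, 0.0]]  # crop view: confident at token 1
loss = self_distillation_loss(student, teacher)
```

In an actual training loop the rollout would be sampled from the student (hence "on-policy"), the teacher's logits would be treated as fixed targets, and the loss would be backpropagated only through the full-image branch. Those details are likewise assumptions about how such a scheme is typically set up, not claims about Vision-OPD itself.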

Evaluation and Benchmarking · Agent and Tool Ecosystem · Multimodal Large Language Models · Thinking-with-Images · on-policy self-distillation · +4 more