AIR: Adaptive Interleaved Reasoning with Code in MLLMs

arXiv · cs.AI · 40h ago

AIR: Adaptive Interleaved Reasoning with Code in Multimodal LLMs via Reinforcement Learning

Researchers propose AIR, a system that trains multimodal large language models (MLLMs) to adaptively interleave reasoning with code execution for numerical computation tasks, going beyond prior work that focused only on visual operations. The approach combines a two-stage cold-start data pipeline, filtering of the reinforcement learning (RL) dataset, and a group-constrained reward function for tool-invocation decisions. Experiments show an average improvement of 6.1 percentage points across the evaluation benchmarks, with interleaved reasoning samples gaining 9.9 percentage points and tool-use success exceeding 95%.
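The summary does not spell out how the group-constrained reward is computed, so the following is only an illustrative sketch of one plausible reading, not the authors' formulation. It assumes that rollouts for the same prompt are scored as a group, and that a bonus for invoking code is granted only when tool-using rollouts in that group are more accurate than tool-free ones. The names `Rollout`, `group_constrained_rewards`, and `tool_bonus`, as well as the reward values, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rollout:
    correct: bool    # did the final answer match the reference?
    used_tool: bool  # did the response invoke code execution?


def group_constrained_rewards(group: List[Rollout], tool_bonus: float = 0.5) -> List[float]:
    """Hypothetical reward: accuracy reward plus a tool bonus that is only
    active when tool use looks helpful within this group of rollouts."""
    with_tool = [r for r in group if r.used_tool]
    without_tool = [r for r in group if not r.used_tool]

    def accuracy(rollouts: List[Rollout]) -> float:
        # Fraction of correct rollouts; 0.0 for an empty subset.
        return sum(r.correct for r in rollouts) / len(rollouts) if rollouts else 0.0

    # Assumed group constraint: the bonus applies only if tool-using rollouts
    # are strictly more accurate than tool-free rollouts for the same prompt.
    tool_helps = bool(with_tool) and accuracy(with_tool) > accuracy(without_tool)

    rewards = []
    for r in group:
        reward = 1.0 if r.correct else 0.0
        if tool_helps and r.used_tool and r.correct:
            reward += tool_bonus
        rewards.append(reward)
    return rewards
```

Under these assumptions, a group of four rollouts in which both tool-using responses are correct and only one of two tool-free responses is correct yields rewards of `[1.5, 1.5, 0.0, 1.0]`. In a group where tool use does not improve accuracy, every rollout receives only its accuracy reward, so the policy is not pushed to call code when it does not help.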