paper

AIR: Adaptive Interleaved Reasoning with Code in MLLMs

paper · active · provisional · air-adaptive-interleaved-reasoning-with-code-in-mllms-d55836ea · 1 event · first seen 40h ago

Aliases: AIR: Adaptive Interleaved Reasoning with Code in MLLMs

More like this (12)

Watch, Remember, Reason: Human-View Video Understanding with MLLMs
Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning
Adaptive Parallel Reasoning
Continual LLM Upcycling: A Predictor-Gated Bank-Wise Sparsity Training Recipe for Dense-to-Sparse LLMs
Reasoning in Memory (RiM)
Towards Root Memories: Benchmarking and Enhancing Implicit Logical Memory Retrieval for Personalized LLMs
CLP: Collocation-Length Prediction for Zero-Loss Adaptive Multi-Token Inference
code synthesis LLMs
Where Does the Answer Come From? Benchmarking View-Level Visual Evidence Identification in Multi-View MLLMs for Autonomous Driving
ExpRL: Exploratory RL for LLM Mid-Training
MLSkip: Data Skipping for ML Filters via Lightweight Metadata
TensorRT-LLM

Recent events (1)

5 · arXiv · cs.AI · 40h ago

AIR: Adaptive Interleaved Reasoning with Code in Multimodal LLMs via Reinforcement Learning

Researchers propose AIR, a system that trains multimodal large language models to adaptively interleave reasoning with code execution for numerical computation tasks, going beyond prior work that focused only on visual operations. The approach combines a two-stage cold-start data pipeline, RL dataset filtering, and a group-constrained reward function for tool-invocation decisions. Experiments show a 6.1 percentage point average improvement on evaluation benchmarks, with interleaved reasoning samples gaining 9.9 pp and tool-use success exceeding 95%.
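To make "adaptively interleave reasoning with code execution" concrete, here is a minimal Python sketch of such a loop. It is an illustration under stated assumptions, not the AIR implementation: the `<code>` and `<output>` tags, the function names, and the `toy_model` stand-in are all hypothetical, since the summary does not describe the paper's prompt format, reward function, or training pipeline.

```python
import contextlib
import io
import re

# Hypothetical tag format; the summary does not say how AIR delimits code.
CODE_PATTERN = re.compile(r"<code>(.*?)</code>", re.DOTALL)


def run_code(source: str) -> str:
    """Execute a snippet and return what it printed (no sandboxing here)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue().strip()


def interleaved_reasoning(generate, question: str, max_turns: int = 4) -> str:
    """Alternate model text and code execution until no code is emitted."""
    transcript = question
    for _ in range(max_turns):
        step = generate(transcript)
        transcript += "\n" + step
        match = CODE_PATTERN.search(step)
        if match is None:
            # The model answered without a tool call, so the loop stops.
            break
        # Feed the execution result back so the next step can use it.
        transcript += "\n<output>" + run_code(match.group(1)) + "</output>"
    return transcript


def toy_model(transcript: str) -> str:
    """Stand-in for an MLLM: calls the tool once, then reads the result."""
    if "<output>" not in transcript:
        return "I need exact arithmetic. <code>print(37 * 43)</code>"
    result = re.search(r"<output>(.*?)</output>", transcript).group(1)
    return "The answer is " + result + "."
```

Calling `interleaved_reasoning(toy_model, "What is 37 * 43?")` yields a transcript in which the tool result is inserted between two reasoning steps. A real system would replace `toy_model` with a model call and run the code in a sandbox rather than with a bare `exec`.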

Agent and Tool Ecosystem · Alignment and RLHF · AIR: Adaptive Interleaved Reasoning with Code in MLLMs · OpenAI