technique

TAURA

technique · active · provisional · taura-edda8556 · 1 event · first seen 3d ago

Aliases: TAURA

Co-occurring entities

OmniAgent · Qwen2.5-VL-72B · LVBench · Native Active Perception as Reasoning for Omni-Modal Understanding · VideoMME

More like this (12)

TAHOE · TauricResearch · TREAD · UI-TARS-desktop · AuRA · TARFlow · TOFU · TAC (Travel Agent Compassion) · ASTRA · TAU-bench · CORA · Luna

Recent events (1)

6 · arXiv · cs.CL · 3d ago

OmniAgent: POMDP-based active perception agent for long video understanding with test-time scaling

Researchers introduce OmniAgent, a multimodal agent that reformulates long video understanding as a POMDP-based iterative Observation-Thought-Action cycle, selectively distilling audio-visual cues into persistent textual memory rather than processing all frames uniformly. The system uses Agentic Supervised Fine-Tuning and a novel reinforcement learning method (TAURA) with turn-level entropy for credit assignment. OmniAgent demonstrates positive test-time scaling and achieves state-of-the-art open-source results across ten benchmarks, with its 7B model outperforming Qwen2.5-VL-72B on LVBench (50.5% vs. 47.3%).
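The summary above says only that TAURA uses turn-level entropy for credit assignment; it does not give the formula. As a rough illustration of the general idea, and not the paper's actual method, the sketch below assumes each turn's share of a single trajectory-level advantage is proportional to the mean token entropy of that turn. The function names, the proportional weighting, and the uniform fallback are all hypothetical.

```python
import math
from typing import List, Sequence

# A turn is a non-empty list of next-token probability distributions.
Turn = List[Sequence[float]]


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def turn_entropies(turns: List[Turn]) -> List[float]:
    """Mean token entropy of each turn (assumes every turn is non-empty)."""
    return [sum(token_entropy(d) for d in turn) / len(turn) for turn in turns]


def turn_level_advantages(trajectory_advantage: float,
                          turns: List[Turn]) -> List[float]:
    """Hypothetical weighting: split one trajectory-level advantage across
    turns in proportion to each turn's mean entropy. This is an assumption
    made for illustration, not the formulation used by TAURA."""
    ents = turn_entropies(turns)
    total = sum(ents)
    if total == 0.0:
        # All turns were fully deterministic: fall back to an even split.
        return [trajectory_advantage / len(ents)] * len(ents)
    return [trajectory_advantage * e / total for e in ents]


if __name__ == "__main__":
    # Turn 0: one maximally uncertain binary choice; turn 1: one certain token.
    example = [[[0.5, 0.5]], [[1.0, 0.0]]]
    print(turn_level_advantages(1.0, example))
```

Under this assumed weighting, a turn where the policy was uncertain receives most of the credit or blame, while a turn where it was certain receives almost none, and the per-turn values still sum to the trajectory-level advantage.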

Inference Economics · Agent and Tool Ecosystem · OmniAgent · Qwen2.5-VL-72B · LVBench