TriViewBench
triviewbench-9eb25594 · 1 event · first seen 22h ago
Aliases: TriViewBench
Recent events (1)
TriViewBench: Controlled benchmark reveals fundamental multi-view spatial reasoning failures in MLLMs
Researchers introduce TriViewBench, a synthetic 3D benchmark of 1,923 scenes and more than 14,000 QA pairs that probes multi-view structural reasoning in multimodal large language models (MLLMs) under controlled complexity scaling. Across 18 open- and closed-source models, the study finds a universal capability hierarchy (Local Decision > Object Counting > Global Recovery), with performance on Global Recovery tasks collapsing by 80% in relative terms at the highest complexity level. Chain-of-Thought prompting yields near-zero benefit, which suggests the bottleneck lies in cross-view spatial representation rather than in reasoning strategy. The work also identifies two mechanistically distinct failure modes in object counting: occlusion blindness, which causes undercounting in single-view tasks, and cross-view identity confusion, which causes overcounting in multi-view tasks.
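As a rough illustration of the two quantities the summary relies on, the sketch below computes a relative performance drop and a mean signed counting error. The function names and all numbers are hypothetical and chosen only for illustration; they are not taken from TriViewBench or its evaluation code. A negative signed error corresponds to undercounting and a positive one to overcounting.

```python
from statistics import mean


def relative_drop(baseline: float, degraded: float) -> float:
    """Fraction of baseline performance lost: (baseline - degraded) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - degraded) / baseline


def mean_signed_count_error(predicted: list[int], actual: list[int]) -> float:
    """Mean of (predicted - actual) over all questions.

    Negative values indicate undercounting, positive values overcounting.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(p - a for p, a in zip(predicted, actual))


# Hypothetical accuracies: 0.50 at the lowest complexity, 0.10 at the highest.
drop = relative_drop(0.50, 0.10)

# Hypothetical counts: a model that misses occluded objects (undercounting)
# versus one that counts the same object again in another view (overcounting).
single_view_error = mean_signed_count_error([3, 4, 5], [4, 6, 5])
multi_view_error = mean_signed_count_error([5, 7], [4, 6])
```

With these made-up inputs, an accuracy falling from 0.50 to 0.10 is an 80% relative drop even though the absolute drop is 40 percentage points, which is why the distinction between relative and absolute figures matters when reading the summary.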