benchmark

SQA3D

benchmark · active · provisional · sqa3d-e2fa14fc · 1 event · first seen 3d ago

Aliases: SQA3D

Co-occurring entities

SPBench VSI-Bench OneCanvas

More like this (12)

SQuAD GQA StrategyQA SimpleQA FreshQA VQA-RAD PQuAD MPI3D SAM 3D HM3D IndQA QUBRIC

Recent events (1)

6 · arXiv · cs.AI · 3d ago

OneCanvas achieves state-of-the-art 3D scene understanding via panoramic reprojection in VLMs

OneCanvas is a new method for 3D scene understanding in Vision-Language Models that aggregates multi-view patch features onto a single equirectangular panoramic canvas using depth and camera pose, avoiding complex geometry encoders or large training budgets. A 3D position embedding restores metric depth information lost during angular projection, and a spatial pretraining curriculum generates on-the-fly supervision for spatial reasoning tasks. The approach achieves state-of-the-art results on SQA3D and VSI-Bench benchmarks while using an order of magnitude less training compute than competing methods, and supports situated reasoning relevant to robotics and embodied AI.
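The summary above describes placing multi-view features on one equirectangular canvas using depth and camera pose. As a rough illustration of the geometry involved, and not the OneCanvas implementation, the sketch below back-projects a pixel with a pinhole model, moves it into world space, and maps it to panorama coordinates around an anchor point. The function names, the axis convention (x right, y down, z forward), and the choice of anchor are all assumptions made for this example. The returned range is the metric quantity that the angular mapping discards, which is what the summary says the 3D position embedding restores.

```python
import math

def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection: pixel (u, v) with metric depth -> camera-frame point."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)

def cam_to_world(p, R, t):
    """Apply a camera-to-world pose. R is a 3x3 nested list, t a 3-vector (assumed layout)."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def to_equirect(p_world, anchor, width, height):
    """Map a world point to (col, row, range) on an equirectangular canvas centred at anchor.

    Assumed convention: x right, y down, z forward; azimuth 0 points along +z.
    """
    dx, dy, dz = (p_world[i] - anchor[i] for i in range(3))
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        raise ValueError("point coincides with the canvas anchor")
    azimuth = math.atan2(dx, dz)      # in [-pi, pi]
    elevation = math.asin(-dy / r)    # in [-pi/2, pi/2], positive is up
    col = ((azimuth / (2 * math.pi) + 0.5) * width) % width  # wrap the seam
    row = (0.5 - elevation / math.pi) * height
    return col, row, r

# Hypothetical camera: identity pose, principal point at the image centre.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
p_cam = backproject(u=320, v=240, depth=2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
p_world = cam_to_world(p_cam, identity, (0.0, 0.0, 0.0))
col, row, rng = to_equirect(p_world, anchor=(0.0, 0.0, 0.0), width=512, height=256)
```

In this sketch, two points along the same ray from the anchor land on the same canvas cell and differ only in the returned range, which shows why a purely angular projection needs an extra depth signal.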

Evaluation and Benchmarking · Multimodal Progress · SPBench · VSI-Bench · OneCanvas