
SQA3D

benchmark · active · provisional · sqa3d-e2fa14fc · 1 event · first seen 3d ago

Aliases: SQA3D

Recent events (1)

arXiv · cs.AI · 3d ago

OneCanvas achieves state-of-the-art 3D scene understanding via panoramic reprojection in VLMs

OneCanvas is a new method for 3D scene understanding in Vision-Language Models that aggregates multi-view patch features onto a single equirectangular panoramic canvas using depth and camera pose, avoiding complex geometry encoders or large training budgets. A 3D position embedding restores metric depth information lost during angular projection, and a spatial pretraining curriculum generates on-the-fly supervision for spatial reasoning tasks. The approach achieves state-of-the-art results on the SQA3D and VSI-Bench benchmarks while using an order of magnitude less training compute than competing methods, and supports situated reasoning relevant to robotics and embodied AI.
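
To make the reprojection step concrete, here is a minimal geometric sketch of how one pixel with known depth and camera pose could be placed on an equirectangular canvas. It is not the OneCanvas implementation: the function names, the pinhole camera model, and the axis convention (x right, y down, z forward) are all assumptions chosen for illustration, and feature aggregation is left out. It also shows why a separate position signal is needed: two points on the same ray share a canvas cell and differ only in range.

```python
import math

def unproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection (assumed model): pixel + metric depth -> camera-frame point."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)

def to_world(p, R, t):
    """Apply a camera-to-world pose: R is a 3x3 rotation (list of rows), t a translation."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def to_equirect(p, center, width, height):
    """Map a world point to (col, row, range) on a canvas centred at `center`."""
    x, y, z = (p[i] - center[i] for i in range(3))
    rng = math.sqrt(x * x + y * y + z * z)        # metric range, dropped by the angular map
    azimuth = math.atan2(x, z)                    # [-pi, pi], 0 = straight ahead (+z)
    elevation = math.atan2(-y, math.hypot(x, z))  # [-pi/2, pi/2], positive = up
    col = (azimuth / (2 * math.pi) + 0.5) * width
    row = (0.5 - elevation / math.pi) * height
    return col, row, rng

# Hypothetical camera: identity pose, 640x480 image, focal length 500 px.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ORIGIN = (0.0, 0.0, 0.0)

def place(depth):
    point = to_world(unproject(320, 240, depth, 500, 500, 320, 240), IDENTITY, ORIGIN)
    return to_equirect(point, ORIGIN, 1024, 512)

near, far = place(2.0), place(4.0)
print(near)  # (512.0, 256.0, 2.0)
print(far)   # (512.0, 256.0, 4.0)
```

Both points land at the canvas centre, so the canvas coordinates alone cannot tell them apart. The range value is the kind of information the summary says a 3D position embedding restores.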