VSI-Bench

benchmark · active · provisional · vsi-bench-6104572c · 1 event · first seen 2d ago

Aliases: VSI-Bench

Recent events (1)

6 · arXiv · cs.AI · 2d ago

OneCanvas achieves state-of-the-art 3D scene understanding via panoramic reprojection in VLMs

OneCanvas is a new method for 3D scene understanding in Vision-Language Models (VLMs). It aggregates multi-view patch features onto a single equirectangular panoramic canvas using depth and camera pose, avoiding complex geometry encoders and large training budgets. A 3D position embedding restores the metric depth information lost during angular projection, and a spatial pretraining curriculum generates on-the-fly supervision for spatial reasoning tasks. The approach achieves state-of-the-art results on the SQA3D and VSI-Bench benchmarks while using an order of magnitude less training compute than competing methods, and it supports situated reasoning relevant to robotics and embodied AI.
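To make the reprojection idea concrete, here is a minimal sketch of how 3D points (for example, patch centers already lifted into world coordinates using depth and camera pose) might be mapped onto an equirectangular canvas. This is not the OneCanvas implementation: the function name `world_to_equirect`, the axis convention (x right, y up, z forward), and the canvas size are assumptions made only for illustration.

```python
import numpy as np

def world_to_equirect(points_world, center, width, height):
    """Map 3D world points to (u, v) coordinates on an equirectangular canvas.

    Assumed convention (hypothetical, not from the source):
    x right, y up, z forward; longitude is measured from +z toward +x.
    Returns float pixel coordinates plus the radial distance r.
    """
    # Direction vectors from the canvas center to each point.
    d = np.asarray(points_world, dtype=float) - np.asarray(center, dtype=float)
    r = np.linalg.norm(d, axis=-1)

    # Longitude in [-pi, pi], latitude in [-pi/2, pi/2].
    lon = np.arctan2(d[..., 0], d[..., 2])
    lat = np.arcsin(np.clip(d[..., 1] / np.maximum(r, 1e-9), -1.0, 1.0))

    # Linear map from angles to canvas pixels (top row = +pi/2 latitude).
    u = (lon / (2.0 * np.pi) + 0.5) * width
    v = (0.5 - lat / np.pi) * height
    return u, v, r

# Three sample points: straight ahead, to the right, and directly above.
pts = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
u, v, r = world_to_equirect(pts, center=np.zeros(3), width=360, height=180)
```

Note that `u` and `v` depend only on direction, so two points along the same ray land on the same canvas location regardless of distance. The radius `r` is returned separately to show what the angular projection discards, which is the kind of metric depth information the summary says a 3D position embedding restores.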