Num2Space

technique · active · provisional · num2space-10c72eb1 · 1 event · first seen 22d ago

Aliases: Num2Space

Recent events (1)

6 · arXiv · cs.AI · 22d ago

SPACENUM: Revisiting Spatial Numerical Understanding in VLMs

SpaceNum is a new evaluation framework that probes whether Vision-Language Models (VLMs) genuinely ground their numerical outputs (coordinates, action magnitudes) in spatial perception rather than relying on shallow cues. The benchmark defines two bidirectional tasks, Num2Space and Space2Num, across dynamic and static spatial settings. Results show that current VLMs perform near random chance on spatial numerical grounding; explicit reasoning yields only marginal improvement, and fine-tuning yields only partial gains.