Spelke Core Knowledge Systems
spelke-core-knowledge-systems-448b886f · 1 event · first seen 28d ago
Aliases: Spelke Core Knowledge Systems
Recent events (1)
ESI-Bench: A Benchmark for Embodied Spatial Intelligence Closing the Perception-Action Loop
ESI-Bench is a new benchmark for embodied spatial intelligence spanning 10 task categories and 29 subcategories, built on OmniGibson and grounded in Spelke's core knowledge systems. It evaluates agents that must actively use perception, locomotion, and manipulation to gather task-relevant evidence, rather than passively processing oracle observations. Experiments on state-of-the-art MLLMs show that active exploration outperforms passive baselines. Most failures, however, stem from two sources: "action blindness" (poor action choices that lead to cascading errors) and a metacognitive gap, in which models commit prematurely and with high confidence regardless of evidence quality. Human studies show that people seek falsifying viewpoints and revise their beliefs when the evidence contradicts them, a capability current models lack.