N-body simulator
n-body-simulator-040753df · 1 event · first seen 22d ago
Aliases: N-body simulator
Recent events (1)
DiscoverPhysics: Interactive Benchmark for LLM Scientific Discovery in Novel Physics Worlds
DiscoverPhysics is a new interactive benchmark that tests LLM agents on their ability to discover laws of motion in 22 simulated worlds with deliberately non-standard physics, including screened gravity, fractional-power interactions, and hidden dark-matter-like particles. Agents must propose experiments, observe N-body trajectory data, and submit both natural-language explanations and Python implementations of inferred laws. Evaluation across eleven frontier models shows the best agents pass only half the worlds, with consistent failures on latent-structure problems and a substantial gap between open-source and commercial models. The benchmark reveals that predictive accuracy and conceptual understanding are dissociable, and that genuine hypothesis refinement through well-designed experiments is required for high explanation scores.
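To make the setup concrete, here is a minimal sketch of the kind of Python law implementation an agent might submit for a screened-gravity world. It is not taken from DiscoverPhysics: the Yukawa form of the potential, the function names `screened_gravity_accel` and `simulate`, and the parameters `G`, `lam`, and `eps` are all assumptions chosen for illustration, and the benchmark's actual interface may differ.

```python
import numpy as np


def screened_gravity_accel(pos, mass, G=1.0, lam=2.0, eps=1e-3):
    """Accelerations for an assumed Yukawa-screened pair potential.

    Assumed potential (illustrative, not the benchmark's definition):
        V(r) = -G * m_i * m_j * exp(-r / lam) / r
    which gives an attractive force of magnitude
        G * m_i * m_j * exp(-r / lam) * (1 / r**2 + 1 / (lam * r)).
    `eps` is a small softening length that avoids division by zero.
    """
    diff = pos[None, :, :] - pos[:, None, :]  # diff[i, j] = pos[j] - pos[i]
    r = np.sqrt((diff ** 2).sum(axis=-1) + eps ** 2)
    mag = G * mass[None, :] * np.exp(-r / lam) * (1.0 / r ** 2 + 1.0 / (lam * r))
    np.fill_diagonal(mag, 0.0)  # no self-interaction
    # Sum over j of (acceleration magnitude) * (unit vector from i to j).
    return (mag[:, :, None] * diff / r[:, :, None]).sum(axis=1)


def simulate(pos, vel, mass, dt=0.01, steps=1000, accel=screened_gravity_accel):
    """Integrate N bodies with velocity Verlet; return the trajectory and final velocities."""
    pos = np.array(pos, dtype=float)  # copies, so the caller's arrays are not mutated
    vel = np.array(vel, dtype=float)
    mass = np.asarray(mass, dtype=float)
    traj = np.empty((steps + 1,) + pos.shape)
    traj[0] = pos
    a = accel(pos, mass)
    for k in range(steps):
        vel += 0.5 * dt * a   # half kick
        pos += dt * vel       # drift
        a = accel(pos, mass)  # forces at the new positions
        vel += 0.5 * dt * a   # second half kick
        traj[k + 1] = pos
    return traj, vel


if __name__ == "__main__":
    # Two bodies with zero total momentum (hypothetical initial conditions).
    pos0 = np.array([[0.0, 0.0], [1.0, 0.0]])
    vel0 = np.array([[0.0, -0.5], [0.0, 0.25]])
    masses = np.array([1.0, 2.0])
    traj, vel = simulate(pos0, vel0, masses, dt=0.01, steps=500)
    print(traj.shape)
```

Passing the force law in through the `accel` argument keeps the integrator fixed while candidate laws are swapped, which is one plausible way to compare a hypothesis against observed trajectories.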