Entity · dataset

NuScenes

dataset · active · nuscenes-2a7f7339 · 1 event · first seen Jun 9, 2026

Aliases: NuScenes

Co-occurring entities

Where Does the Answer Come From? Benchmarking View-Level Visual Evidence Identification in Multi-View MLLMs for Autonomous Driving

More like this (12)

Midscene · Natural Scenes Dataset · Natural Stories · TinyStories · SCENT · SpaceNum · NERSC · ScIRGen · Unified Scenario Engine · Seedance 2.0 · Project Numina · InternScience

Recent events (1)

5 · arXiv · cs.CL · Jun 9, 2026

Benchmark for view-level visual evidence identification in multi-view MLLMs for autonomous driving

A new arXiv preprint introduces a multi-view visual question answering benchmark targeting evidence-source identification in autonomous driving scenarios. Given six synchronized NuScenes camera views and a question, a model must identify which camera view supports the answer, not merely produce a correct answer. The 122-pair benchmark spans causality, counterfactual reasoning, and intent prediction, and exposes grounding failures that answer-only evaluation misses. The work addresses the gap between answer accuracy and correct visual grounding in safety-critical multimodal systems.
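To make the distinction between answer accuracy and view-level grounding concrete, here is a minimal Python sketch of how the two could be scored side by side. It is an illustration under stated assumptions, not the preprint's evaluation code: the `Prediction` record, the `score` function, and exact-match scoring are all hypothetical. Only the six camera channel names come from the nuScenes dataset itself.

```python
from dataclasses import dataclass

# The six camera channels provided by the nuScenes dataset.
CAMERA_VIEWS = (
    "CAM_FRONT", "CAM_FRONT_LEFT", "CAM_FRONT_RIGHT",
    "CAM_BACK", "CAM_BACK_LEFT", "CAM_BACK_RIGHT",
)


@dataclass(frozen=True)
class Prediction:
    """Hypothetical record; the benchmark's real data format is not given here."""
    pred_answer: str
    gold_answer: str
    pred_view: str   # camera view the model claims supports its answer
    gold_view: str   # annotated evidence view


def score(preds):
    """Return answer, view, and joint accuracy (exact match is an assumption)."""
    if not preds:
        raise ValueError("no predictions to score")
    for p in preds:
        if p.gold_view not in CAMERA_VIEWS:
            raise ValueError(f"unknown gold view: {p.gold_view}")

    n = len(preds)
    answer_ok = [p.pred_answer == p.gold_answer for p in preds]
    # A predicted view outside CAMERA_VIEWS simply counts as wrong.
    view_ok = [p.pred_view == p.gold_view for p in preds]

    return {
        "answer_acc": sum(answer_ok) / n,
        "view_acc": sum(view_ok) / n,
        # Correct answer AND correct evidence view.
        "joint_acc": sum(a and v for a, v in zip(answer_ok, view_ok)) / n,
        # The failure mode that answer-only evaluation cannot see.
        "right_answer_wrong_view": sum(
            a and not v for a, v in zip(answer_ok, view_ok)
        ) / n,
    }
```

A model can score well on `answer_acc` while `right_answer_wrong_view` stays high, which is the kind of grounding failure the benchmark is described as exposing.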

Evaluation and Benchmarking · Multimodal Progress · NuScenes · Where Does the Answer Come From? Benchmarking View-Level Visual Evidence Identification in Multi-View MLLMs for Autonomous Driving