Pearson correlation
pearson-correlation-1a8dd62b · 1 event · first seen 15d ago
Aliases: Pearson correlation
Recent events (1)
PaSBench-Video: A Streaming Video Benchmark for Proactive Safety Warning in MLLMs
PaSBench-Video is a 740-video benchmark designed to evaluate whether multimodal large language models (MLLMs) can issue timely, accurate safety warnings during the window between a visible danger sign and an accident. Videos span four domains (driving, healthcare, daily life, industrial production) and are annotated with frame-level risk onset and accident boundaries, so the task requires causal temporal reasoning rather than static scene classification. Testing 13 MLLMs shows that no model exceeds 20% on the strictest metric. Across models, recall is strongly coupled to false-positive rate (Pearson r=0.64): models that catch more hazards also raise more false alarms, which suggests they rely on scene-level activity cues rather than genuine hazard reasoning. Performance varies sharply by domain, and driving is particularly problematic because routine and hazardous scenes look visually similar.
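As a rough illustration of the statistic quoted above, the sketch below computes a Pearson correlation coefficient between per-model recall and false-positive rate. The function name `pearson_r` and the numbers in `recall` and `false_positive_rate` are hypothetical placeholders chosen for the example; they are not values from PaSBench-Video, and the benchmark's own computation may differ.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-model scores, NOT taken from the benchmark.
    recall = [0.20, 0.35, 0.50, 0.65, 0.80]
    false_positive_rate = [0.10, 0.30, 0.20, 0.45, 0.40]

    r = pearson_r(recall, false_positive_rate)
    print(f"Pearson r = {r:.2f}")
```

A positive r on data like this means that models with higher recall also tend to have higher false-positive rates, which is the pattern the summary describes. The coefficient measures linear association only and says nothing by itself about why the two quantities move together.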