benchmark
HarmVideoBench
benchmarkactiveprovisional
harmvideobench-1d497347·1 events·first seen 7d agoAliases: HarmVideoBench
Co-occurring entities
More like this (12)
Recent events (1)
HarmVideoBench: Multi-layered benchmark for harmful video understanding in large multimodal models
Researchers introduce HarmVideoBench, a diagnostic benchmark of 1,379 videos paired with 4,137 multiple-choice questions designed to evaluate harmful video understanding across three hierarchical dimensions: Observable Evidence, Clip-Internal Meaning, and Beyond-Clip Reasoning. The benchmark addresses limitations in existing work by moving beyond binary classification and requiring explanatory rationales. The authors evaluate 19 leading models and introduce BCR, a method that dynamically retrieves context based on predicted reasoning boundaries, improving macro average performance from 61.7% to 84.4%.