Almanac
benchmark

CinePile 2.0

benchmark·active·cinepile-2-0-486576a4·1 event·first seen 28d ago

Aliases: CinePile 2.0

Recent events (1)

5·Hugging Face Blog·28d ago

CinePile 2.0 - Making Stronger Datasets with Adversarial Refinement

CinePile 2.0 is a new video question-answering benchmark and dataset designed to evaluate long-form video understanding in multimodal models. The dataset uses adversarial refinement techniques to reduce spurious correlations and improve question difficulty, making it harder for models to answer correctly without genuine video comprehension. It targets a known weakness in existing video benchmarks where models can exploit language priors rather than visual content.
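One way to picture the language-prior problem described above is a "blind" check: answer each multiple-choice question from its text alone and flag the ones that can still be answered correctly. The sketch below is only an illustration of that idea, not CinePile 2.0's actual refinement pipeline. The `MCQuestion` type, the `longest_choice_baseline` heuristic, the `flag_blind_answerable` helper, and the sample questions are all hypothetical names and data made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class MCQuestion:
    """Hypothetical multiple-choice video question (the video itself is omitted)."""
    question: str
    choices: Sequence[str]
    answer_index: int  # index of the correct choice


# A "blind" answerer sees only the question text and the choices, never the video.
BlindAnswerer = Callable[[str, Sequence[str]], int]


def longest_choice_baseline(question: str, choices: Sequence[str]) -> int:
    """Toy text-only heuristic: always pick the longest answer choice."""
    return max(range(len(choices)), key=lambda i: len(choices[i]))


def flag_blind_answerable(
    questions: Sequence[MCQuestion],
    answerers: Sequence[BlindAnswerer],
) -> List[MCQuestion]:
    """Return the questions that at least one blind answerer gets right."""
    return [
        q
        for q in questions
        if any(a(q.question, q.choices) == q.answer_index for a in answerers)
    ]


# Made-up sample questions for illustration only.
q1 = MCQuestion(
    question="What does the character pick up?",
    choices=["A key", "A letter that was hidden under the floorboards", "A cup"],
    answer_index=1,  # the longest choice happens to be correct
)
q2 = MCQuestion(
    question="Where does the scene take place?",
    choices=["In a crowded train station at night", "A beach", "A kitchen"],
    answer_index=2,  # the longest choice is wrong here
)

flagged = flag_blind_answerable([q1, q2], [longest_choice_baseline])
for q in flagged:
    print("answerable without video:", q.question)
```

In a setup like this, flagged questions would be candidates for rewriting or removal, since a correct answer to them says little about video comprehension. A real check would presumably use a text-only language model as the blind answerer and control for chance, for example by shuffling the choices over several trials.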