dataset

PadChest-GR

dataset · active · provisional · padchest-gr-84573452 · 1 event · first seen 12h ago

Aliases: PadChest-GR

Co-occurring entities

CheXpert Plus · MIMIC-CXR · SHOVIR

More like this (12)

PCGrad · MedCPT · CVDP · GRASP · GRASP Lab · PALS · CV-Bench-2D · GP-UCB · CoRP · PlanBench-XL · DPG Benchmark · CARV

Recent events (1)

5 · arXiv · cs.CL · 12h ago

SHOVIR benchmark exposes vision shortcut learning failures in radiology report generation VLMs

Researchers introduce SHOVIR, a benchmark for detecting 'vision shortcut' behavior in Vision-Language Models applied to Radiology Report Generation (RRG), where models achieve high scores by exploiting learned priors rather than actual image evidence. The benchmark extends MIMIC-CXR and PadChest-GR with per-box CheXpert labels and uses localized occlusion experiments to isolate two failure modes: direct shortcuts (findings persist after visual evidence is removed) and contextual shortcuts (detection degrades when co-occurring pathologies are occluded). Evaluating eight state-of-the-art VLMs, the authors find that high report quality does not correlate with strong spatial grounding, revealing a systematic blind spot in current RRG evaluation protocols.
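The two failure modes described above can be illustrated with a small occlusion probe. This is a minimal sketch under stated assumptions, not the SHOVIR authors' implementation: the helper names (`occlude`, `probe_shortcuts`), the toy model `toy_predict`, the finding names, and the box layout are all hypothetical, and the boxes are assumed not to overlap.

```python
import numpy as np

def occlude(image, box, fill=0.0):
    """Return a copy of `image` with box (x0, y0, x1, y1) set to a constant."""
    x0, y0, x1, y1 = box
    out = image.copy()
    out[y0:y1, x0:x1] = fill
    return out

def probe_shortcuts(image, boxes, predict):
    """boxes: finding -> (x0, y0, x1, y1); predict: image -> set of findings.

    Hypothetical helper. Assumes boxes do not overlap, so occluding the
    other findings leaves the target's own evidence untouched.
    """
    baseline = predict(image)
    results = {}
    for finding, box in boxes.items():
        if finding not in baseline:
            continue  # only probe findings reported on the intact image
        # Direct shortcut: the finding persists after its own region is removed.
        direct = finding in predict(occlude(image, box))
        # Contextual shortcut: the finding is lost when only the regions of
        # co-occurring findings are removed.
        others_hidden = image
        for other, other_box in boxes.items():
            if other != finding:
                others_hidden = occlude(others_hidden, other_box)
        contextual = finding not in predict(others_hidden)
        results[finding] = {"direct": direct, "contextual": contextual}
    return results

def toy_predict(image):
    """Toy stand-in for a VLM (an assumption for illustration only)."""
    findings = {"cardiomegaly"}            # always reported: a pure prior
    effusion_seen = image[0:4, 0:4].mean() > 0.5
    edema_seen = image[4:8, 0:4].mean() > 0.5
    if effusion_seen:
        findings.add("effusion")           # grounded in its own region
    if edema_seen and effusion_seen:
        findings.add("edema")              # also leans on the effusion region
    return findings

# Illustrative 8x8 "image" with three bright, non-overlapping regions.
image = np.zeros((8, 8), dtype=float)
boxes = {
    "effusion": (0, 0, 4, 4),
    "cardiomegaly": (4, 4, 8, 8),
    "edema": (0, 4, 4, 8),
}
for x0, y0, x1, y1 in boxes.values():
    image[y0:y1, x0:x1] = 1.0

report = probe_shortcuts(image, boxes, toy_predict)
```

In this toy setup, `cardiomegaly` is flagged as a direct shortcut, `edema` as a contextual shortcut, and `effusion` as neither. A real evaluation would replace `toy_predict` with a report-generation model plus a label extractor, and would likely need a more careful occlusion fill than a constant value.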

Evaluation and Benchmarking · Multimodal Progress · CheXpert Plus · MIMIC-CXR · PadChest-GR · +1 more