benchmark

VQA-RAD

benchmark · active · provisional · vqa-rad-c47a7189 · 1 event · first seen 2d ago

Aliases: VQA-RAD

Co-occurring entities

RefRad2D Slake RadGrounder

More like this (12)

CXR-VQA GQA StrategyQA PQuAD DocVQA FreshQA SQuAD MTVQA SQA3D IndQA VQ-VAE Omega-QVLA

Recent events (1)

5 · arXiv · cs.CL · 2d ago

RefRad2D dataset and RadGrounder model enable spatially grounded radiology VLMs without manual annotations

Researchers introduce RefRad2D, a 1.2M-pair bilingual (German/English) CT and MR image-text dataset generated automatically via LLM curation and automated segmentation, requiring no manual spatial annotations. The accompanying RadGrounder model jointly performs report generation, VQA, and spatial grounding via bounding-box or segmentation outputs. On the external benchmarks Slake and VQA-RAD, RadGrounder matches specialized medical VLMs, and the added grounding supervision does not degrade language quality. The work demonstrates that large-scale, automatically curated clinical data can transfer to downstream medical VQA tasks.

Evaluation and Benchmarking · Multimodal Progress · RefRad2D · Slake · RadGrounder