Entity · benchmark

WikiVQABench

benchmark · active · wikivqabench-5c96c89c · 1 event · first seen May 21, 2026

Aliases: WikiVQABench

Co-occurring entities

large language models · Wikidata · Vision-Language Models · Wikipedia

More like this (12)

VitaBench · VBench · EQ-Bench · DocVQA · AdvBench · QVQ-Max · GraphVid-Bench · SpecBench · VerifierBench · VQ-VAE · SimpleQA · ProverBench

Recent events (1)

5 · arXiv · cs.AI · May 21, 2026

WikiVQABench: A Knowledge-Grounded Visual Question Answering Benchmark from Wikipedia and Wikidata

WikiVQABench is a new human-curated visual question answering (VQA) benchmark whose questions require external knowledge beyond visual perception. It is constructed by combining Wikipedia images and captions with structured knowledge from Wikidata; an LLM generates candidate questions, which human annotators then review. The benchmark targets knowledge-intensive reasoning in vision-language models (VLMs). In the reported evaluation of 15 VLMs ranging from 256M to 90B parameters, accuracy spans 24.7% to 75.6%, indicating that the benchmark discriminates meaningfully across model scales. The dataset and code are publicly released.

Evaluation and Benchmarking · Multimodal Progress · large language models · Wikidata · WikiVQABench