benchmark
WikiVQABench
benchmarkactive
wikivqabench-5c96c89c·1 events·first seen 26d agoAliases: WikiVQABench
Co-occurring entities
More like this (12)
Recent events (1)
WikiVQABench: A Knowledge-Grounded Visual Question Answering Benchmark from Wikipedia and Wikidata
WikiVQABench is a new human-curated VQA benchmark that requires external knowledge beyond visual perception, constructed by combining Wikipedia images, captions, and Wikidata structured knowledge with LLM-generated question candidates reviewed by human annotators. The benchmark evaluates knowledge-intensive reasoning in vision-language models, covering 15 VLMs ranging from 256M to 90B parameters. Accuracy spans 24.7% to 75.6%, indicating meaningful discrimination across model scales. The dataset and code are publicly released.