benchmark
Document Visual Question Answering
benchmark · active
document-visual-question-answering-5518fb46 · 1 event · first seen 28d ago
Aliases: Document Visual Question Answering
More like this (12)
Visual Question Answering
visual document retrieval
DocVQA
Document AI
Document AI Playground
Trace Only What You Need: Structure-Aware On-Demand Hypergraph Memory for Long-Document Question Answering
VisualMem
visual language model
computer vision
clarifying-question prompting
Table Question Answering
Where Does the Answer Come From? Benchmarking View-Level Visual Evidence Identification in Multi-View MLLMs for Autonomous Driving
Recent events (1)
Docmatix: A Large-Scale Dataset for Document Visual Question Answering
Hugging Face released Docmatix, a large-scale dataset designed for Document Visual Question Answering (DocVQA) tasks. The dataset aims to address the scarcity of high-quality training data for document understanding in multimodal models. It is intended to improve fine-tuning of vision-language models on document comprehension tasks.