ACL Anthology
dataset · active
acl-anthology-7c0ed763 · 1 event · first seen 27d ago
Aliases: ACL Anthology
Recent events (1)
ACL-Verbatim: Hallucination-Free Extractive QA System for Research Papers
The paper introduces ACL-Verbatim, an extractive question answering system built on VerbatimRAG that maps user queries directly to verbatim text spans in ACL Anthology papers, eliminating hallucination by design. The authors contribute a new ground-truth benchmark dataset created via human NLP-researcher annotation over synthetic queries generated using a ScIRGen-based pipeline. A 150M-parameter ModernBERT token classifier trained on silver supervision achieves the best word-level F1 of 53.6, outperforming the strongest LLM-based extractor at 48.7. The work demonstrates that smaller extractive models can outperform large generative LLMs on precision-critical retrieval tasks.
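The summary reports word-level F1 without defining it. As a rough illustration only, the sketch below computes a token-overlap F1 between a predicted span and a gold span, in the style commonly used for extractive QA. The function name `word_f1`, the lowercasing, and the whitespace tokenization are assumptions made for this example; the paper's actual scoring procedure may differ.

```python
from collections import Counter


def word_f1(predicted: str, gold: str) -> float:
    """Token-overlap F1 between a predicted span and a gold span.

    Assumption: tokens are lowercased, whitespace-separated words.
    This is an illustrative metric, not the paper's confirmed scorer.
    """
    pred_tokens = predicted.lower().split()
    gold_tokens = gold.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)

    # Multiset intersection counts each shared word at most as often
    # as it appears in both spans.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical spans, not taken from the benchmark.
    score = word_f1(
        "the model uses attention",
        "the model uses self attention layers",
    )
    print(f"{score:.3f}")
```

In this example every predicted word appears in the gold span (precision 1.0), but only four of the six gold words are recovered (recall about 0.667), so the F1 is 0.8. Scores such as 53.6 and 48.7 would correspond to a value like this averaged over a benchmark and multiplied by 100, assuming that reporting convention.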