An Empirical Analysis of Factual Errors in Human-Written Text and its Application
Recent events (1)
Empirical taxonomy of factual errors in human-written text reveals LLM detection gaps
A new arXiv paper introduces a taxonomy of factual error types in human-written text, derived from an analysis of newspaper article corrections. The taxonomy includes categories, such as kanji misconversions and numeral classifier errors, that are absent from existing hallucination benchmarks. The authors evaluate several LLMs on the Factual Error Detection (FED) task using both synthetic and real correction data. Even high-performing models such as GPT-5.4 reach only about 52% word-level F1 on the synthetic data, which underscores how hard human-induced factual errors are to detect, as distinct from LLM hallucinations. The work highlights a subproblem of factual accuracy research that has been neglected as the field has shifted its focus toward LLM-generated hallucinations.
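
To make the word-level F1 figure concrete, here is a minimal sketch of how such a score could be computed for one sentence. It assumes that gold and predicted errors are given as sets of word positions, and that F1 is the harmonic mean of precision and recall over those positions. The function name `word_level_f1`, the index-set representation, and the convention for the empty case are illustrative assumptions; the paper's exact scoring procedure is not described in this summary.

```python
def word_level_f1(gold_error_words, predicted_error_words):
    """Word-level F1 between gold and predicted error positions.

    Both arguments are iterables of word indices flagged as erroneous.
    This is an assumed, generic definition, not the paper's confirmed metric.
    """
    gold = set(gold_error_words)
    pred = set(predicted_error_words)

    # Assumed convention: nothing to find and nothing flagged is a perfect score.
    if not gold and not pred:
        return 1.0

    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0

    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: words 3 and 4 are wrong; the detector flags words 4 and 7.
score = word_level_f1({3, 4}, {4, 7})
print(f"word-level F1: {score:.2f}")
```

Under this definition, a detector that finds half of the erroneous words and is right about half of the words it flags scores 0.5, which gives a rough sense of what a score near 52% means in practice.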