When LLMs Read Tables Carelessly: Measuring and Reducing Data Referencing Errors

arXiv · cs.AI · 2d ago

Systematic study of LLM data referencing errors in tables, with lightweight critic model mitigation

A new arXiv paper introduces the first systematic evaluation of data referencing errors (DREs), meaning incorrect citation or omission of table values, across LLMs ranging from 1.7B to 20B parameters. The authors find that DREs are pervasive across all tested models and tasks, and that they corrupt intermediate reasoning steps rather than only final answers. They show that critic-based filtering with rejection sampling improves answer accuracy by up to 12%, and they train a lightweight 4B critic model that achieves 78.2% F1 at detecting DREs, both in- and out-of-distribution.
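To make the idea of critic-based filtering with rejection sampling concrete, here is a minimal Python sketch. It is not the paper's method: the paper's critic is a trained 4B model, whereas `toy_critic` below is a hypothetical rule-based stand-in that only flags numbers absent from the table. The names `toy_critic` and `rejection_sample` and the example table are all assumptions made for illustration.

```python
import re
from typing import Callable, Iterable, List, Optional

Table = List[List[str]]

# Matches integers and simple decimals, e.g. "709000" or "3.5".
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def toy_critic(table: Table, response: str) -> bool:
    """Hypothetical critic: return True if the response cites a number
    that does not appear in any table cell (a possible DRE)."""
    table_values = {
        match for row in table for cell in row for match in NUMBER.findall(cell)
    }
    return any(value not in table_values for value in NUMBER.findall(response))


def rejection_sample(
    table: Table,
    candidates: Iterable[str],
    critic: Callable[[Table, str], bool],
) -> Optional[str]:
    """Return the first candidate the critic does not flag, or None if
    every candidate is rejected."""
    for response in candidates:
        if not critic(table, response):
            return response
    return None


# Usage: the candidates stand in for repeated samples from an LLM.
table = [["city", "population"], ["Oslo", "709000"], ["Bergen", "285000"]]
candidates = [
    "Oslo has 790000 residents.",  # transposed digits: flagged by the critic
    "Oslo has 709000 residents.",  # matches the table: accepted
]
accepted = rejection_sample(table, candidates, toy_critic)
```

A rule like this would wrongly reject responses that contain legitimately derived values such as sums or averages, which is one reason a learned critic is the more plausible design for real use.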