benchmark
TABVERSE
benchmark · active · provisional
tabverse-b5b6d378 · 1 event · first seen 8d ago · Aliases: TABVERSE
Recent events (1)
TABVERSE benchmark isolates table representation effects across formats in LLMs and VLMs
TABVERSE is a new controlled multimodal benchmark that evaluates LLMs and VLMs on table understanding by holding table content fixed while varying only the representation format (HTML, Markdown, LaTeX, or rendered images). Evaluation across three tasks (Question Answering, Structural Understanding, and Structure Reconstruction) shows that representation choice substantially affects performance: structured text generally outperforms rendered images, and HTML is the most robust text format. The benchmark addresses a gap in existing evaluations, where content, format, and modality vary simultaneously and representation effects therefore cannot be isolated.
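The "fixed content, varying format" idea can be illustrated with a minimal sketch. This is not TABVERSE's actual pipeline: the function names (`to_markdown`, `to_html`, `to_latex`, `render_all`) and the toy table are hypothetical, and the rendered-image variant is omitted because it would need an external rendering library.

```python
from html import escape

# Toy table, invented for illustration only (not benchmark data).
HEADER = ["Item", "Count"]
ROWS = [["apple", "3"], ["pear", "5"]]


def to_markdown(header, rows):
    """Serialize the table as a pipe table (no escaping of '|' in cells)."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join(lines)


def to_html(header, rows):
    """Serialize the table as a single-line HTML <table>, escaping cell text."""
    head = "".join(f"<th>{escape(cell)}</th>" for cell in header)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


def to_latex(header, rows):
    """Serialize the table as a LaTeX tabular (no escaping of special chars)."""
    column_spec = "l" * len(header)
    lines = [f"\\begin{{tabular}}{{{column_spec}}}"]
    lines.append(" & ".join(header) + " \\\\")
    lines.append("\\hline")
    lines += [" & ".join(row) + " \\\\" for row in rows]
    lines.append("\\end{tabular}")
    return "\n".join(lines)


def render_all(header, rows):
    """Return every text representation of the same underlying content."""
    return {
        "markdown": to_markdown(header, rows),
        "html": to_html(header, rows),
        "latex": to_latex(header, rows),
    }


if __name__ == "__main__":
    for name, text in render_all(HEADER, ROWS).items():
        print(f"--- {name} ---")
        print(text)
```

Because every variant is produced from the same `HEADER` and `ROWS`, any difference in a model's answers across the three strings could be attributed to the representation rather than to the content, which is the kind of isolation the summary describes.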