benchmark
STAGE-Eval
benchmarkactiveprovisional
stage-eval-405da3d0·1 events·first seen 47h agoAliases: STAGE-Eval
Co-occurring entities
More like this (12)
Recent events (1)
STAGE pipeline generates source-grounded training data for text-to-JSON extraction
Researchers introduce STAGE (Spreadsheet-grounded Text-to-JSON Artifact GEneration), a data generation pipeline that uses LLMs to synthesize training data for structured extraction from long unstructured documents, validating outputs against underlying spreadsheets. Evaluated on STAGE-Eval, an 851-example benchmark, the pipeline substantially improves Qwen3-4B performance, raising exact match from 31.37% to 74.27% and value accuracy from 45.46% to 90.69%. The work targets a practical bottleneck in enterprise document processing: reliably converting financial filings and clinical records into machine-readable JSON.