dataset
FOXGLOVE
dataset · active · provisional
foxglove-ae5b64e7 · 1 event · first seen 11d ago
Aliases: FOXGLOVE
Recent events (1)
FOXGLOVE dataset enables systematic comparison of LLM vs. expert writing feedback on argumentative essays
Researchers introduce FOXGLOVE, a dataset of 2,340 feedback comments on 69 twelfth-grade argumentative essays, comprising 696 comments from trained writing instructors and 1,644 from four frontier LLMs under a shared protocol. The study finds that while instructors and LLMs distribute feedback similarly across goals and essay positions, they diverge on which specific sentences to address. LLM feedback receives higher quality ratings from instructors on most dimensions, but the advantage appears largely attributable to comment length rather than substantive quality. The dataset enables systematic evaluation of human-LLM alignment in educational feedback generation.