Entity · dataset

FOXGLOVE

dataset · active · foxglove-ae5b64e7 · 1 event · first seen Jun 5, 2026

Aliases: FOXGLOVE

More like this (12)

FROG BLOOM FLEURS LEAF-X FLORES Greenoaks BLOOMZ Flower GiGPO Petals FOLIO GraphGPO

Recent events (1)

4 · arXiv · cs.CL · Jun 5, 2026

FOXGLOVE dataset enables systematic comparison of LLM vs. expert writing feedback on argumentative essays

Researchers introduce FOXGLOVE, a dataset of 2,340 feedback comments on 69 twelfth-grade argumentative essays, comprising 696 comments from trained writing instructors and 1,644 from four frontier LLMs under a shared protocol. The study finds that while instructors and LLMs distribute feedback similarly across goals and essay positions, they diverge on which specific sentences to address. LLM feedback receives higher quality ratings from instructors on most dimensions, but the advantage appears largely attributable to comment length rather than substantive quality. The dataset enables systematic evaluation of human-LLM alignment in educational feedback generation.

Evaluation and Benchmarking · FOXGLOVE