dataset

ASAP++

dataset · active · provisional · asap--95ba8e24 · 2 events · first seen 2d ago

Aliases: ASAP++

Co-occurring entities

From Texts to Scores: Tracing the Emergence of Essay Quality Representations in Large Language Models · CSEE · ENEM · PsyScore · Graded Partial Credit Model

More like this (12)

ASAM · AASIST · FunASR · APEX-Agents-AA · AdaSR · AMP · ProAct · ACROS · USAD 2.0 · ActiveSAM · FAST · ACE

Recent events (2)

4 · arXiv · cs.CL · 2d ago

Mechanistic analysis of how LLMs encode essay quality in internal representations

Researchers systematically probe the hidden representations of eight LLMs across three essay datasets (ASAP++, CSEE, ENEM) to understand how automated essay scoring (AES) works internally. Using linear probing, dimensionality reduction, and neuron-level analysis, they find essay quality is encoded in a linearly accessible form that emerges progressively across layers and partially transfers across prompts. Individual 'essay scoring neurons' are identified whose activations correlate with scores and respond to targeted interventions, with longer essays relying more on deeper layers. The work contributes to mechanistic interpretability of LLM-based scoring systems.

Evaluation and Benchmarking · From Texts to Scores: Tracing the Emergence of Essay Quality Representations in Large Language Models · CSEE · ENEM · +1 more
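
The summary above says essay quality is linearly accessible and emerges progressively across layers. A minimal sketch of what a layer-wise linear probe can look like follows. It is not the authors' code: the ridge probe, the function names (fit_ridge_probe, probe_layers, make_synthetic_layers), and the synthetic "hidden states" are all assumptions made for illustration, and no real model or ASAP++ data is involved.

```python
import numpy as np


def fit_ridge_probe(X, y, alpha=1.0):
    """Fit a linear probe y ~ X @ w + b with closed-form ridge regression."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    d = X.shape[1]
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the probe weights.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(d), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


def r2_score(y_true, y_pred):
    """Coefficient of determination on held-out essays."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def probe_layers(hidden_by_layer, scores, train_frac=0.8, alpha=1.0, seed=0):
    """Train one probe per layer and return the held-out R^2 for each layer."""
    rng = np.random.default_rng(seed)
    n = len(scores)
    perm = rng.permutation(n)
    n_train = int(train_frac * n)
    train, test = perm[:n_train], perm[n_train:]
    results = []
    for H in hidden_by_layer:
        w, b = fit_ridge_probe(H[train], scores[train], alpha)
        results.append(r2_score(scores[test], H[test] @ w + b))
    return results


def make_synthetic_layers(n_essays=400, dim=32, n_layers=4, seed=0):
    """Hypothetical stand-in for per-layer essay representations.

    The score signal is injected along one direction and grows with depth,
    purely to give the probe something to find. These are not real activations.
    """
    rng = np.random.default_rng(seed)
    scores = rng.uniform(1, 6, size=n_essays)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    layers = []
    for layer in range(n_layers):
        signal = layer / (n_layers - 1)  # 0 at the first layer, 1 at the last
        noise = rng.normal(size=(n_essays, dim))
        centered = scores - scores.mean()
        layers.append(noise + signal * 3.0 * np.outer(centered, direction))
    return layers, scores
```

Calling probe_layers(*make_synthetic_layers()) returns one held-out R^2 per layer. With real models, hidden_by_layer would instead hold pooled activations extracted from each layer for the same set of essays.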

4 · arXiv · cs.CL · 2d ago

PsyScore: Psychometrically-aware framework integrating IRT scoring with ZPD-scaffolded LLM feedback for essay assessment

PsyScore is a new framework for Automated Essay Scoring (AES) that unifies diagnostic assessment and instructional feedback through a shared latent ability representation. It combines a neural Item Response Theory scorer (based on the Graded Partial Credit Model) with a multi-agent LLM feedback generator conditioned on estimated student proficiency, operationalizing Vygotsky's Zone of Proximal Development. Experiments on the ASAP++ dataset show competitive scoring performance alongside more pedagogically aligned feedback. The work addresses a gap between psychometric rigor and LLM-based adaptive instruction.

Evaluation and Benchmarking · Agent and Tool Ecosystem · PsyScore · Graded Partial Credit Model · ASAP++
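
The PsyScore summary above names an IRT scorer but does not give its formula. As a rough illustration of how a partial-credit IRT model turns a latent ability into score-category probabilities, the sketch below uses the standard partial credit form with a discrimination parameter. This is an assumption: the paper's Graded Partial Credit Model and its neural parameterization may differ, and the names partial_credit_probs and expected_score are hypothetical.

```python
import numpy as np


def partial_credit_probs(theta, thresholds, discrimination=1.0):
    """Category probabilities for one item under a partial-credit IRT model.

    theta: latent ability of the student.
    thresholds: step difficulties b_1..b_K; the item has K + 1 score categories.
    discrimination: slope a shared by all steps (assumed parameterization).
    """
    thresholds = np.asarray(thresholds, dtype=float)
    steps = discrimination * (theta - thresholds)
    # Category k has logit sum_{v<=k} a * (theta - b_v); category 0 has logit 0.
    logits = np.concatenate(([0.0], np.cumsum(steps)))
    logits -= logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


def expected_score(theta, thresholds, discrimination=1.0):
    """Expected score category (0..K) for a student with ability theta."""
    probs = partial_credit_probs(theta, thresholds, discrimination)
    return float(np.dot(np.arange(len(probs)), probs))
```

For example, partial_credit_probs(0.0, [-1.0, 0.0, 1.0]) gives a distribution over four score categories, and expected_score rises as theta increases, which is the property a proficiency-conditioned feedback generator would rely on.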