Cognitive Episodes in LLM Reasoning Traces Enable Interpretable Human Item Difficulty Prediction
Recent events (1)
Epi2Diff framework uses LLM reasoning traces to predict human item difficulty in educational assessment
Researchers introduce Epi2Diff (Episode to Difficulty), a framework that parses the reasoning traces of Large Reasoning Models (LRMs) into structured sequences of cognitive episodes and uses them to predict how difficult test items are for humans. The approach extracts features from the reasoning dynamics (effort allocation, state transitions, and iteration patterns) and combines them with semantic item representations. In experiments on four real-world difficulty datasets, including SAT-derived benchmarks, it achieves an 8.1% average relative gain over supervised LLM fine-tuning baselines. The work offers interpretable, process-level evidence for educational measurement without requiring costly human calibration.
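
To make the three families of process features concrete, here is a minimal sketch of how they might be computed once a trace has already been segmented into labeled episodes. It is an illustration only, not the Epi2Diff implementation: the episode labels (`read`, `plan`, `compute`, `verify`), the `(label, token_count)` input format, and the function name `episode_features` are all assumptions, and the summary does not specify the framework's actual episode taxonomy or feature definitions.

```python
from collections import Counter

def episode_features(episodes):
    """Summarize a reasoning trace given as (label, token_count) pairs in trace order.

    Hypothetical input format and labels; the real framework's taxonomy may differ.
    """
    labels = [label for label, _ in episodes]
    total_tokens = sum(count for _, count in episodes)

    # Effort allocation: share of total tokens spent in each episode type.
    tokens_by_label = Counter()
    for label, count in episodes:
        tokens_by_label[label] += count
    effort_share = (
        {label: n / total_tokens for label, n in tokens_by_label.items()}
        if total_tokens else {}
    )

    # State transitions: counts of consecutive (from_label, to_label) pairs.
    transitions = Counter(zip(labels, labels[1:]))

    # Iteration: how often the trace re-enters an episode type it had already left.
    seen, revisits, previous = set(), 0, None
    for label in labels:
        if label != previous:
            if label in seen:
                revisits += 1
            seen.add(label)
        previous = label

    return {
        "effort_share": effort_share,
        "transitions": dict(transitions),
        "revisits": revisits,
    }

# Made-up trace for illustration (not data from the paper).
trace = [("read", 40), ("plan", 20), ("compute", 100), ("verify", 30), ("compute", 10)]
features = episode_features(trace)
print(features["effort_share"])
print(features["transitions"])
print(features["revisits"])
```

In a pipeline like the one described, a feature dictionary of this kind would be flattened into a vector and concatenated with a semantic embedding of the item before being passed to a difficulty regressor; that downstream step is omitted here because the summary gives no details about it.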