Cognitive Episodes in LLM Reasoning Traces Enable Interpretable Human Item Difficulty Prediction

Recent events (1)

arXiv · cs.AI · 21h ago

Epi2Diff framework uses LLM reasoning traces to predict human item difficulty in educational assessment

Researchers introduce Epi2Diff (Episode to Difficulty), a framework that parses the reasoning traces of Large Reasoning Models (LRMs) into structured sequences of cognitive episodes and uses them to predict how difficult test items are for humans. The approach extracts features from the reasoning dynamics (effort allocation, state transitions, and iteration patterns) and combines them with semantic item representations. In experiments on four real-world difficulty datasets, including SAT-derived benchmarks, it achieves an 8.1% average relative gain over supervised LLM fine-tuning baselines. The work offers interpretable process evidence for educational measurement without requiring costly human calibration.
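To make the three feature families concrete, here is a minimal Python sketch of how they might be computed from an already-parsed episode sequence. It is an illustration under stated assumptions, not the paper's implementation: the episode labels (`read`, `plan`, `execute`, `verify`), the `(label, token_count)` input format, and the function name `episode_features` are all hypothetical, since the summary names neither the episode taxonomy nor the exact feature definitions.

```python
from collections import Counter

# Hypothetical episode taxonomy; the source does not list the real labels.
EPISODE_TYPES = ["read", "plan", "execute", "verify"]


def episode_features(episodes):
    """Compute simple process features from a parsed reasoning trace.

    episodes: list of (label, token_count) tuples in trace order.
    """
    labels = [label for label, _ in episodes]
    total_tokens = sum(count for _, count in episodes)

    # Effort allocation: share of trace tokens spent in each episode type.
    tokens_by_type = Counter()
    for label, count in episodes:
        tokens_by_type[label] += count
    effort = {
        t: (tokens_by_type[t] / total_tokens if total_tokens else 0.0)
        for t in EPISODE_TYPES
    }

    # State transitions: counts of consecutive (from, to) label pairs.
    transitions = Counter(zip(labels, labels[1:]))

    # Iteration: how often the trace re-enters an episode type it left earlier.
    seen = set()
    revisits = 0
    previous = None
    for label in labels:
        if label != previous and label in seen:
            revisits += 1
        seen.add(label)
        previous = label

    return {
        "effort": effort,
        "transitions": dict(transitions),
        "revisits": revisits,
    }


# Made-up trace: the model executes, verifies, then loops back once.
trace = [
    ("read", 40), ("plan", 60), ("execute", 200),
    ("verify", 50), ("execute", 100), ("verify", 50),
]
features = episode_features(trace)
```

In a pipeline of the kind described above, a feature dictionary like this would be flattened into a vector and concatenated with a semantic embedding of the item before being passed to a difficulty regressor. That combination step is omitted here because the summary does not specify how it is done.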