Almanac
product

Epi2Diff

product · active · provisional · epi2diff-dee6b343 · 1 event · first seen 22h ago

Aliases: Epi2Diff

Recent events (1)

4 · arXiv · cs.AI · 22h ago

Epi2Diff framework uses LLM reasoning traces to predict human item difficulty in educational assessment

Researchers introduce Epi2Diff (Episode to Difficulty), a framework that parses the reasoning traces of Large Reasoning Models (LRMs) into structured sequences of cognitive episodes in order to predict how difficult test items are for humans. The approach extracts features from reasoning dynamics (effort allocation, state transitions, and iteration patterns) and combines them with semantic item representations. Experiments on four real-world difficulty datasets, including SAT-derived benchmarks, show an 8.1% average relative gain over supervised LLM fine-tuning baselines. The work provides interpretable process evidence for educational measurement without requiring costly human calibration.
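
To make the feature-extraction idea in the summary above concrete, here is a minimal Python sketch. It is not the authors' implementation. The episode labels (`read`, `plan`, `execute`, `verify`), the `(label, token_count)` episode format, the revisit-based definition of "iteration", and every function name are assumptions made for illustration, since the summary does not specify the episode taxonomy or the exact features. The sketch assumes a trace has already been segmented into episodes and that a semantic item embedding is available as a list of floats.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical episode taxonomy; the source does not list the actual labels.
EPISODE_TYPES = ["read", "plan", "execute", "verify"]

# One episode = (label, number of tokens spent in that episode).
Episode = Tuple[str, int]


def effort_allocation(episodes: Sequence[Episode]) -> List[float]:
    """Share of total trace tokens spent in each episode type."""
    total = sum(n for _, n in episodes)
    tokens = Counter()
    for label, n in episodes:
        tokens[label] += n
    return [tokens[t] / total if total else 0.0 for t in EPISODE_TYPES]


def transition_counts(episodes: Sequence[Episode]) -> List[float]:
    """Flattened count matrix of transitions between consecutive episodes."""
    pairs = Counter((a[0], b[0]) for a, b in zip(episodes, episodes[1:]))
    return [float(pairs[(s, t)]) for s in EPISODE_TYPES for t in EPISODE_TYPES]


def iteration_count(episodes: Sequence[Episode]) -> float:
    """Number of times the trace returns to an episode type it already left."""
    seen = set()
    revisits = 0
    prev = None
    for label, _ in episodes:
        if label != prev and label in seen:
            revisits += 1
        seen.add(label)
        prev = label
    return float(revisits)


def build_features(episodes: Sequence[Episode],
                   item_embedding: Sequence[float]) -> List[float]:
    """Concatenate process features with a semantic item representation."""
    return (effort_allocation(episodes)
            + transition_counts(episodes)
            + [iteration_count(episodes)]
            + list(item_embedding))


if __name__ == "__main__":
    # Made-up trace for one test item; token counts are illustrative only.
    trace = [("read", 40), ("plan", 60), ("execute", 200),
             ("verify", 50), ("execute", 100), ("verify", 50)]
    embedding = [0.1, -0.3, 0.7]  # placeholder semantic vector
    print(build_features(trace, embedding))
```

The resulting vector could be passed to any standard regressor to predict a difficulty score. Simple concatenation is used here only because it is the most direct reading of "combines them with semantic item representations"; the actual fusion method may differ.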