EDIT: Evidence-Diagnosed Intervention Training for Rule-Faithful LLM Grading

paper · active · edit-evidence-diagnosed-intervention-training-for-rule-faithful-llm-grading-cacd610a · 1 event · first seen Jun 5, 2026

Aliases: EDIT: Evidence-Diagnosed Intervention Training for Rule-Faithful LLM Grading

Co-occurring entities

Evidence-Diagnosed Intervention Training

More like this (12)

Evidence-Diagnosed Intervention Training
LLM Detection as an Intervention: Downstream Impact under Strategic User Behavior
Relaxing Faithfulness with Intervention-Only Causal Discovery
Beyond Sycophancy: Structured Resistance and Compliance in LLM Moral Reasoning
Measuring Epistemic Resilience of LLMs Under Misleading Medical Context
What Do Safety-Aligned LLMs Learn From Mixed Compliance Demonstrations?
When the Judge Changes, So Does the Measurement: Auditing LLM-as-Judge Reliability
Can LLMs Judge Better Than They Generate? Evaluating Task Asymmetry, Mechanistic Interpretability and Transferability for In-Context QA
Reason-Mediated Behavioral Models for Auditing LLM Social Simulators
Judgement-of-Learning
Revising Context, Shifting Simulated Stance: Auditing LLM-Based Stance Simulation in Online Discussions
Grounding LLM Reasoning under Incomplete Graph Evidence

Recent events (1)

4 · arXiv · cs.CL · Jun 5, 2026

EDIT framework trains more rubric-faithful LLM graders via internal-state diagnostics

Researchers introduce Evidence-Diagnosed Intervention Training (EDIT), a two-phase framework for improving LLM-based rubric grading. The first phase (EDIT-SFT) identifies problematic reasoning steps using posterior belief signals and input-grounding scores, then revises only those steps with rubric checklists. The second phase (EDIT-RL) uses belief-guided reward shaping to penalize harmful belief drift during reinforcement learning. Experiments on two real-world multi-subject grading benchmarks show consistent improvements over SFT and RL baselines on both in-domain and out-of-domain splits.
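
The two phases summarized above can be illustrated with a small sketch. This is not the paper's implementation: the summary gives no formulas, so everything below is an assumption made for illustration, including the function names `flag_steps` and `shaped_reward`, the thresholds `drop_tol` and `min_grounding`, the penalty weight `lam`, and the idea that a belief is a single probability assigned to the correct grade after each reasoning step.

```python
from typing import List


def flag_steps(beliefs: List[float], grounding: List[float],
               drop_tol: float = 0.1, min_grounding: float = 0.5) -> List[int]:
    """Return indices of reasoning steps that look problematic.

    Hypothetical convention: beliefs[0] is the belief in the correct grade
    before any step, and beliefs[i + 1] is the belief after step i.
    grounding[i] is an assumed input-grounding score for step i in [0, 1].
    """
    if len(beliefs) != len(grounding) + 1:
        raise ValueError("expected one more belief than grounding scores")
    flagged = []
    for i, score in enumerate(grounding):
        drop = beliefs[i] - beliefs[i + 1]  # positive when belief got worse
        if drop > drop_tol or score < min_grounding:
            flagged.append(i)
    return flagged


def shaped_reward(base_reward: float, beliefs: List[float],
                  lam: float = 1.0) -> float:
    """Subtract a penalty proportional to the total downward belief drift.

    Only drops in belief are penalized; steps that raise belief cost nothing.
    """
    drift = sum(max(0.0, beliefs[i] - beliefs[i + 1])
                for i in range(len(beliefs) - 1))
    return base_reward - lam * drift


if __name__ == "__main__":
    beliefs = [0.5, 0.7, 0.4, 0.8]   # illustrative values, not real data
    grounding = [0.9, 0.8, 0.3]
    print(flag_steps(beliefs, grounding))
    print(shaped_reward(1.0, beliefs, lam=0.5))
```

In this toy setup, a step is a candidate for revision when belief in the correct grade falls sharply across it or when it is weakly grounded in the input, and the shaped reward lowers the outcome reward by the accumulated downward drift. How EDIT actually computes beliefs, grounding scores, and the penalty is specified in the paper, not here.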

Evaluation and Benchmarking · Alignment and RLHF · Evidence-Diagnosed Intervention Training · EDIT: Evidence-Diagnosed Intervention Training for Rule-Faithful LLM Grading