paper

The Role of Feedback Alignment in Self-Distillation

paper · active · provisional · the-role-of-feedback-alignment-in-self-distillation-76918a3a · 1 event · first seen 7d ago

More like this (12)

Learning from the Self-future: On-policy Self-distillation for dLLMs · deliberative alignment · on-policy self-distillation · Be My Tutor: On-Policy Co-Distillation for Mutual LLM Improvement via Peer Feedback · Consistency Training Can Entrench Misalignment · Self-Distillation · Reinforcement Learning from Rich Feedback with Distributional DAgger · On-Policy Co-Distillation · post-training alignment · self-refinement · Weak-to-Strong Distillation · on-policy distillation

Recent events (1)

5 · arXiv · cs.AI · 7d ago

Step-aligned critique outperforms GRPO and reference-solution conditioning in self-distillation

A new arXiv paper investigates context design for self-distillation of language models, comparing binary reward (GRPO), reference solutions, and step-by-step critiques aligned to the solver's reasoning trace. Step-aligned critique yields the largest gains, outperforming GRPO by 16.11 points and reference-solution conditioning by 5.27 points on Avg@12. Per-token advantage analysis shows that step-aligned feedback targets only failing tokens, avoiding unnecessary pressure on already-correct reasoning steps. The findings suggest structural alignment between feedback and the model's reasoning trace is a key driver of self-distillation effectiveness.
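The contrast drawn in the per-token advantage analysis can be illustrated with a minimal sketch. The summary does not give the paper's formulas, so everything here is an assumption: the function names are hypothetical, the GRPO signal is taken to be a group-normalised sequence reward broadcast to every token, and the self-distillation signal is taken to be the difference between a feedback-conditioned teacher's log-probability and the student's log-probability on each sampled token.

```python
from typing import List


def grpo_token_advantages(rewards: List[float], lengths: List[int]) -> List[List[float]]:
    """Assumed GRPO-style signal: normalise each sequence reward within the
    group, then assign the same value to every token of that sequence."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    if std == 0.0:
        std = 1.0  # all rewards equal: avoid division by zero
    return [[(r - mean) / std] * n for r, n in zip(rewards, lengths)]


def distill_token_advantages(student_logp: List[float], teacher_logp: List[float]) -> List[float]:
    """Assumed self-distillation signal: teacher log-prob minus student
    log-prob for each sampled token (hypothetical formulation)."""
    return [t - s for s, t in zip(student_logp, teacher_logp)]


# Two sampled solutions: one correct (3 tokens), one incorrect (2 tokens).
# Every token of the incorrect solution receives the same negative value.
uniform = grpo_token_advantages([1.0, 0.0], [3, 2])

# Illustrative log-probs for one 4-token solution; the feedback-conditioned
# teacher disagrees with the student only at the third token.
student = [-0.25, -0.5, -0.25, -0.125]
teacher = [-0.25, -0.5, -2.25, -0.125]
targeted = distill_token_advantages(student, teacher)
```

Under these assumptions, `uniform` penalises both tokens of the failed solution equally, while `targeted` is nonzero only at the one token where the teacher and student disagree. This is the behaviour the summary attributes to step-aligned feedback.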

Evaluation and Benchmarking · Alignment and RLHF · GRPO · The Role of Feedback Alignment in Self-Distillation