Q-target framework
technique · active · provisional
q-target-framework-76ad0318 · 1 event · first seen 7d ago
Aliases: Q-target framework
Recent events (1)
Q-target framework unifies supervised fine-tuning variants through target distribution design
A new arXiv preprint reframes supervised fine-tuning (SFT) as a problem of target distribution design rather than loss objective selection. It introduces the Q-target framework, which decomposes SFT supervision into two explicit choices: how much to rely on the observed token, and how to allocate the remaining probability mass. The authors show that many existing SFT variants can be understood as implicit choices of this target distribution. They propose Target-SFT, which constructs training objectives directly from the desired target distribution, and report consistent improvements across ten reasoning dataset-model settings. The work offers a unifying theoretical lens on these variants and opens a broader design space for SFT objectives.
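To make the two choices concrete, here is a minimal sketch in plain Python. It is an illustration only and is not the paper's Target-SFT objective: the function names, the `alpha` parameter (mass placed on the observed token), and the `rest_weights` parameter (how the remaining mass is spread over the other tokens) are all assumptions introduced here. The sketch builds a per-token target distribution and scores model logits against it with cross-entropy.

```python
import math


def build_target(vocab_size, observed, alpha, rest_weights=None):
    """Build a target distribution q over the vocabulary (hypothetical helper).

    alpha        -- choice 1: probability mass placed on the observed token.
    rest_weights -- choice 2: non-negative weights used to spread the
                    remaining 1 - alpha mass over the other tokens
                    (uniform when None).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if rest_weights is None:
        rest_weights = [1.0] * vocab_size
    # The observed token receives mass only through alpha.
    others = [0.0 if i == observed else w for i, w in enumerate(rest_weights)]
    total = sum(others)
    if total == 0.0:
        if alpha != 1.0:
            raise ValueError("no other token can receive the remaining mass")
        q = [0.0] * vocab_size
    else:
        q = [(1.0 - alpha) * w / total for w in others]
    q[observed] = alpha
    return q


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def target_cross_entropy(q, logits):
    """Cross-entropy between target q and the model's softmax distribution."""
    logp = log_softmax(logits)
    return -sum(qi * lpi for qi, lpi in zip(q, logp) if qi > 0.0)


logits = [2.0, 0.5, -1.0, 0.0]  # made-up model logits for one position
observed = 0                    # index of the token seen in the training data

# alpha = 1: all mass on the observed token, i.e. the usual SFT log-loss.
hard = build_target(len(logits), observed, alpha=1.0)

# alpha < 1 with uniform remainder: a label-smoothing-style target.
smooth = build_target(len(logits), observed, alpha=0.9)

# alpha < 1 with the remainder shaped by the model's own probabilities.
model_probs = [math.exp(lp) for lp in log_softmax(logits)]
shaped = build_target(len(logits), observed, alpha=0.9, rest_weights=model_probs)

for name, q in [("hard", hard), ("smooth", smooth), ("shaped", shaped)]:
    print(name, [round(x, 4) for x in q], round(target_cross_entropy(q, logits), 4))
```

Under these assumptions, setting `alpha=1.0` recovers the ordinary negative log-likelihood on the observed token, while smaller values of `alpha` combined with different `rest_weights` produce other targets from the same template. In a real training loop, any model-derived `rest_weights` would typically be treated as constants rather than differentiated through.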