Entity · technique

importance-weighted supervised fine-tuning

technique · active · importance-weighted-supervised-fine-tuning-aa29d384 · 1 event · first seen Jun 1, 2026

Aliases: importance-weighted supervised fine-tuning

Co-occurring entities

KL-regularized RL · Reinforcement Learning · DRIFT

More like this (12)

supervised fine-tuning · Parameter-Efficient Fine-Tuning · Super-Tuning: From Activation-Aware Pruning to Sparse Fine-Tuning · Hierarchical Advantage Weighting for Online RL Fine-Tuning of VLAs from Sparse Episode Outcomes · reinforcement fine-tuning · FORCE: Efficient VLA Reinforcement Fine-Tuning via Value-Calibrated Warm-up and Self-Distillation · A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design · Retrieval-Augmented Fine-Tuning · Language Model Finetuning · behavioral fine-tuning · fine-tuning · orthogonal finetuning

Recent events (1)

6 · arXiv · cs.CL · Jun 1, 2026

DRIFT: Decoupled Rollouts and Importance-Weighted Fine-Tuning for Efficient Multi-Turn Optimization

DRIFT is a training framework that bridges online RL and offline SFT for multi-turn LLM optimization by exploiting the theoretical equivalence between KL-regularized RL and importance-weighted supervised learning. It decouples rollout generation from policy optimization: trajectories are sampled from a fixed reference policy offline, weighted by return-based importance scores, and used for weighted SFT. Empirically, DRIFT matches or exceeds multi-turn RL baselines while retaining the efficiency and simplicity of standard supervised fine-tuning. Code is publicly released.
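The equivalence mentioned above rests on a standard result: the policy that maximizes expected return minus β times the KL divergence from a reference policy is proportional to π_ref(y|x) · exp(R(x, y) / β). Fitting a model to that target by maximum likelihood on samples drawn from π_ref gives a weighted SFT loss with weights proportional to exp(R / β). The sketch below illustrates that generic recipe only. The summary does not state DRIFT's exact weighting scheme, so the exponential form, the temperature `beta`, the batch-level self-normalization, and the function names `importance_weights` and `weighted_sft_loss` are all assumptions made for illustration.

```python
import math


def importance_weights(returns, beta=1.0):
    """Self-normalized weights w_i proportional to exp(R_i / beta).

    Assumption: exponentiated-return weights, as suggested by the
    closed-form optimum of KL-regularized RL. Not taken from DRIFT's code.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    scaled = [r / beta for r in returns]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sft_loss(token_logprobs, returns, beta=1.0):
    """Importance-weighted negative log-likelihood over a batch.

    token_logprobs: one list per trajectory, holding the log-probability
        the policy being trained assigns to each target token.
    returns: one scalar return per trajectory, sampled offline from a
        fixed reference policy.
    """
    weights = importance_weights(returns, beta)
    seq_logprobs = [sum(lp) for lp in token_logprobs]
    return -sum(w * lp for w, lp in zip(weights, seq_logprobs))


if __name__ == "__main__":
    # Two hypothetical trajectories; the second has the higher return.
    logps = [[-1.0, -1.0], [-3.0]]
    print(importance_weights([0.0, 1.0], beta=0.5))
    print(weighted_sft_loss(logps, [0.0, 1.0], beta=0.5))
```

With equal returns the weights are uniform and the loss reduces to the ordinary mean sequence-level SFT loss. As `beta` shrinks, the weight concentrates on the highest-return trajectories. In a real training loop the log-probabilities would come from the model's forward pass and the loss would be backpropagated; that part is omitted here.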

Inference Economics · Agent and Tool Ecosystem · KL-regularized RL · Reinforcement Learning · DRIFT