SAW

technique · active · provisional · saw-27135636 · 1 event · first seen 19d ago

Aliases: SAW

Recent events (1)

6 · arXiv · cs.AI · 19d ago

Demystifying Data Organization for Enhanced LLM Training

This Microsoft Research paper systematically investigates how data organization, as distinct from data selection, affects LLM training efficiency across the pre-training and SFT stages. The authors formalize four guidelines (Boundary Sharpening, Cyclic Scheduling, Curriculum Continuity, and Local Diversity) and introduce two novel data ordering methods, STR and SAW, that reuse pre-computed sample-level scores with minimal additional overhead. Experiments across multiple model scales and dataset sizes show improved training stability and performance, and the code is publicly released.