Adaptive Data Scheduling

technique · active · provisional · adaptive-data-scheduling-1c45c1e8 · 1 event · first seen 43h ago

Aliases: Adaptive Data Scheduling

Recent events (1)

5 · arXiv · cs.CL · 43h ago

Adaptive Data Scheduling (ADS) improves LLM reinforcement learning post-training by 5.2% over GRPO

Researchers propose Adaptive Data Scheduling (ADS), a dual-level framework that replaces uniform sampling in RL post-training with two adaptive mechanisms: a sampling distribution over semantic clusters and policy-boundary sample selection. Evaluated across three LLMs and seven reasoning benchmarks, ADS improves average accuracy by 5.2% over GRPO and generalizes across RL objectives. The method addresses a structural limitation of standard RL post-training pipelines by accounting for both the semantic structure of the data and the policy's evolving capability during training.
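
The two levels described above can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not give the scoring rules, so everything here is an assumption made for illustration. The sketch assumes each training sample has a recent pass rate under the current policy, scores a cluster by p * (1 - p) on its mean pass rate p, and treats samples with pass rates closest to 0.5 as lying on the policy boundary. The names `cluster_weights`, `boundary_samples`, and `schedule_batch` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def cluster_weights(pass_rates, cluster_of, temperature=0.5):
    """Cluster level: map per-cluster scores to sampling probabilities.

    ASSUMPTION: a cluster's score is p * (1 - p) for its mean pass rate p,
    so clusters the policy always or never solves get the lowest score.
    """
    by_cluster = defaultdict(list)
    for sample_id, rate in pass_rates.items():
        by_cluster[cluster_of[sample_id]].append(rate)

    scores = {}
    for cluster, rates in by_cluster.items():
        mean = sum(rates) / len(rates)
        scores[cluster] = mean * (1.0 - mean)

    # Softmax over scores, shifted by the maximum for numerical stability.
    top = max(scores.values())
    exps = {c: math.exp((s - top) / temperature) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def boundary_samples(pass_rates, cluster_of, cluster, k):
    """Sample level: pick the k samples in a cluster nearest the boundary.

    ASSUMPTION: "policy boundary" means a pass rate closest to 0.5.
    Ties are broken by sample id so the result is deterministic.
    """
    members = [s for s in pass_rates if cluster_of[s] == cluster]
    members.sort(key=lambda s: (abs(pass_rates[s] - 0.5), s))
    return members[:k]


def schedule_batch(pass_rates, cluster_of, batch_size, rng):
    """Draw a per-cluster quota from the adaptive distribution, then fill
    each quota with boundary samples. The batch can come out smaller than
    batch_size if a cluster has fewer members than its quota."""
    weights = cluster_weights(pass_rates, cluster_of)
    clusters = sorted(weights)
    draws = rng.choices(clusters, weights=[weights[c] for c in clusters], k=batch_size)
    quota = Counter(draws)
    batch = []
    for cluster in clusters:
        batch.extend(boundary_samples(pass_rates, cluster_of, cluster, quota[cluster]))
    return batch


if __name__ == "__main__":
    # Toy data: cluster A is partly solved, cluster B is fully solved.
    pass_rates = {"a1": 0.5, "a2": 0.9, "b1": 1.0, "b2": 1.0}
    cluster_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
    print(cluster_weights(pass_rates, cluster_of))
    print(schedule_batch(pass_rates, cluster_of, batch_size=2, rng=random.Random(0)))
```

In a training loop, the pass rates would be refreshed from recent rollouts so that both the cluster distribution and the per-cluster selection follow the policy as it improves; how often to refresh, and what temperature to use, are left open here.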