product
ReuseRL
product · active · provisional
reuserl-42c7cb16 · 1 event · first seen 16d ago
Aliases: ReuseRL
Recent events (1)
ReuseRL: Skill Reuse as Compression in Agentic RL via MDL Principle
ReuseRL formalizes agentic reinforcement learning through the Minimum Description Length (MDL) principle, extracting a shared skill dictionary from successful trajectories and augmenting the RL objective with a segmentation cost that penalizes idiosyncratic, non-reusable behaviors. The authors prove a PAC-Bayes generalization bound for this compression penalty. Evaluated on ALFWorld, TextWorld-Cooking, and Countdown-Stepwise, ReuseRL outperforms vanilla GRPO and round-length baselines on both in-distribution and out-of-distribution tasks.
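The idea of a segmentation cost that favors reusable behavior can be illustrated with a toy sketch. The summary above does not give ReuseRL's actual cost function, so everything here is an assumption made for illustration: the function names `segmentation_cost` and `penalized_return`, the fixed per-skill and per-literal code lengths, the exact-match rule for skills, and the weight `lam` are all hypothetical. The sketch encodes a trajectory as cheaply as possible using either dictionary skills or single literal actions, then subtracts the weighted code length from the task return.

```python
import math
from typing import Sequence, Tuple


def segmentation_cost(
    actions: Sequence[str],
    skills: Sequence[Tuple[str, ...]],
    skill_cost: float = 1.0,    # assumed code length (bits) for one skill reference
    literal_cost: float = 3.0,  # assumed code length (bits) for one raw action
) -> float:
    """Cheapest encoding of `actions` as a sequence of skills and literal actions.

    Dynamic programming over prefixes: best[i] is the minimum cost of
    encoding the first i actions.
    """
    n = len(actions)
    best = [0.0] + [math.inf] * n
    for i in range(n):
        # Option 1: emit the next action literally (always possible).
        best[i + 1] = min(best[i + 1], best[i] + literal_cost)
        # Option 2: cover the next len(skill) actions with a dictionary skill.
        for skill in skills:
            j = i + len(skill)
            if skill and j <= n and tuple(actions[i:j]) == tuple(skill):
                best[j] = min(best[j], best[i] + skill_cost)
    return best[n]


def penalized_return(
    task_return: float,
    actions: Sequence[str],
    skills: Sequence[Tuple[str, ...]],
    lam: float = 0.1,  # hypothetical trade-off weight
) -> float:
    """Task return minus a weighted description-length penalty."""
    return task_return - lam * segmentation_cost(actions, skills)


# Hypothetical skill dictionary and two trajectories with the same task return.
skills = [("go", "take"), ("open", "put")]
reusable = ["go", "take", "open", "put"]       # fully covered by two skills
idiosyncratic = ["look", "go", "look", "put"]  # no skill matches, all literals

print(segmentation_cost(reusable, skills))
print(segmentation_cost(idiosyncratic, skills))
print(penalized_return(1.0, reusable, skills) > penalized_return(1.0, idiosyncratic, skills))
```

Under these assumed costs, a trajectory that decomposes into dictionary skills is cheaper to describe than one built from one-off actions, so it keeps more of its return after the penalty. A real implementation would presumably learn the dictionary from successful trajectories and derive code lengths from usage frequencies rather than fixing them by hand.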