Entity · product

ReuseRL

product · active · reuserl-42c7cb16 · 1 event · first seen Jun 1, 2026

Aliases: ReuseRL

Co-occurring entities

Minimum Description Length · ALFWorld · Countdown-Stepwise · PAC-Bayes · GRPO · TextWorld-Cooking

More like this (12)

ContextRL · PrefixRL · ExpRL · CheckRLM · MedRLM · SafeRL-Lab · PipelineRL · OpenRLHF · RL² · MemRL · prime-rl · VRRL

Recent events (1)

arXiv · cs.AI · Jun 1, 2026

ReuseRL: Skill Reuse as Compression in Agentic RL via MDL Principle

ReuseRL formalizes agentic reinforcement learning through the Minimum Description Length (MDL) principle, extracting a shared skill dictionary from successful trajectories and augmenting the RL objective with a segmentation cost that penalizes idiosyncratic, non-reusable behaviors. The authors prove a PAC-Bayes generalization bound for this compression penalty. Evaluated on ALFWorld, TextWorld-Cooking, and Countdown-Stepwise, ReuseRL outperforms vanilla GRPO and round-length baselines on both in-distribution and out-of-distribution tasks.
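
To make the idea of a segmentation cost concrete, here is a minimal Python sketch of one way such a penalty could be computed. It is not the authors' formulation: the summary does not give the actual objective, so the bit-cost model, the function names `segmentation_cost` and `shaped_return`, and the weight `lam` are all assumptions chosen for illustration. The sketch treats a trajectory as a list of action tokens, finds the cheapest way to cover it with dictionary skills or literal actions, and subtracts the resulting code length from the task reward.

```python
from math import log2
from typing import Sequence

Action = str
Skill = tuple[Action, ...]


def segmentation_cost(
    trajectory: Sequence[Action],
    skills: Sequence[Skill],
    n_primitive_actions: int,
) -> float:
    """Fewest bits needed to encode `trajectory` as skills or literal actions.

    Assumed cost model (hypothetical, not taken from the paper):
      - a skill reference costs log2(len(skills) + 1) bits
        (one index per skill plus one escape symbol);
      - a literal action costs the escape symbol plus
        log2(n_primitive_actions) bits for the raw action.
    """
    skill_bits = log2(len(skills) + 1)
    literal_bits = skill_bits + log2(n_primitive_actions)

    n = len(trajectory)
    # best[i] = cheapest encoding of the first i actions (dynamic programming).
    best = [0.0] + [float("inf")] * n
    for end in range(1, n + 1):
        # Option 1: emit the last action literally.
        best[end] = best[end - 1] + literal_bits
        # Option 2: end the prefix with any skill that matches exactly.
        for skill in skills:
            start = end - len(skill)
            if start >= 0 and tuple(trajectory[start:end]) == skill:
                best[end] = min(best[end], best[start] + skill_bits)
    return best[n]


def shaped_return(
    task_reward: float,
    trajectory: Sequence[Action],
    skills: Sequence[Skill],
    n_primitive_actions: int,
    lam: float = 0.01,  # hypothetical penalty weight
) -> float:
    """Task reward minus a weighted description-length penalty."""
    return task_reward - lam * segmentation_cost(
        trajectory, skills, n_primitive_actions
    )


if __name__ == "__main__":
    skills = [("go", "take"), ("open", "put")]
    reusable = ["go", "take", "open", "put"]       # covered by two skills
    idiosyncratic = ["put", "go", "open", "take"]  # no skill matches
    print(segmentation_cost(reusable, skills, 4))
    print(segmentation_cost(idiosyncratic, skills, 4))
```

Under this assumed cost model, a trajectory that decomposes into dictionary skills is cheaper to describe than one that has to be spelled out action by action, so for equal task reward it receives the higher shaped return. In a GRPO-style setup the shaped return would take the place of the raw reward when computing group-relative advantages; how ReuseRL actually integrates the penalty is not stated in the summary.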

Evaluation and Benchmarking · Agent and Tool Ecosystem · Minimum Description Length · ALFWorld · Countdown-Stepwise