Entity · paper

An Agency-Transferring Model-Free Policy Enhancement Technique

paper · active · an-agency-transferring-model-free-policy-enhancement-technique-ff731883 · 1 event · first seen Jun 9, 2026

Aliases: An Agency-Transferring Model-Free Policy Enhancement Technique

More like this (12)

APPO: Agentic Procedural Policy Optimization
Preference Coordinated Multi-agent Policy Optimization
Role-Aware Policy Optimization
GraphPO: Graph-based Policy Optimization for Reasoning Models
OR Else: A Differentiable Trust Region for Policy Optimization
Vector Policy Optimization
Model-Generated Agent Skills (paper)
Learning Red Agent Policy from Observations for Neurosymbolic Autonomous Cyber Agents
Routing-based On-Policy Distillation
Pass the Baton: Trajectory-Relayed On-Policy Distillation
Evolved Policy Gradients
on-policy distillation

Recent events (1)

4 · arXiv · cs.LG · Jun 9, 2026

Agency-transferring technique improves RL policy training by bootstrapping from baseline policies

A new arXiv paper proposes a model-free reinforcement learning method that embeds an existing suboptimal baseline policy into training via an arbitration mechanism, progressively transferring control from the baseline to a trainable neural network. The approach yields high goal-reaching rates from the start of training and produces a standalone policy that outperforms the baseline without requiring it at inference time. Theoretical bounds on goal-reaching probability are derived, and empirical results on continuous-control benchmarks show competitive or superior returns compared to existing methods.
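
The arbitration idea in the summary above can be illustrated with a minimal sketch. This is not the paper's algorithm: the summary does not say how the arbiter decides or how control is scheduled, so the linear schedule, the per-step random choice, and the names `baseline_share` and `arbitrate` are all assumptions made for illustration only.

```python
import random
from typing import Callable, Sequence, Tuple

State = Sequence[float]
Action = Sequence[float]
Policy = Callable[[State], Action]


def baseline_share(step: int, total_steps: int) -> float:
    """Probability that the baseline policy acts at this training step.

    ASSUMPTION: a linear decay from 1.0 to 0.0. The summary only says
    control is transferred "progressively"; the real schedule may differ.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 1.0 - progress


def arbitrate(
    state: State,
    baseline: Policy,
    learner: Policy,
    step: int,
    total_steps: int,
    rng: random.Random,
) -> Tuple[Action, str]:
    """Pick which policy controls the agent for one environment step.

    ASSUMPTION: a simple random draw per step stands in for the
    paper's arbitration mechanism, which the summary does not detail.
    """
    if rng.random() < baseline_share(step, total_steps):
        return baseline(state), "baseline"
    return learner(state), "learner"
```

Under these assumptions the baseline acts on every step at the start of training, which matches the summary's point about high goal-reaching rates early on, and the learner acts alone once the schedule reaches zero, so the baseline is no longer needed afterwards.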

Alignment and RLHF · An Agency-Transferring Model-Free Policy Enhancement Technique