Evolved Policy Gradients
technique · active
evolved-policy-gradients-ab4538b4 · 1 event · first seen 28d ago
Aliases: Evolved Policy Gradients
More like this (12)
Policy Gradient Methods, policy gradient, Integrated Gradients, GRPO (Group Relative Policy Optimization), Dual-Evidence Gradient Purification, Proximal Policy Optimization, Wasserstein Policy Gradient, gradient accumulation, Evolution Strategies, Divergence Regularized Policy Optimization, Gradient Labs, Denoising Diffusion Policy Optimization
Recent events (1)
Evolved Policy Gradients: OpenAI Meta-Learning via Loss Function Evolution
OpenAI released Evolved Policy Gradients (EPG), a meta-learning method that evolves the loss function used to train reinforcement learning agents instead of relying on a hand-designed one. An outer loop searches over the parameters of a learned loss, and an inner loop trains each agent's policy by gradient descent on that loss. The learned loss lets agents train faster on novel tasks than an off-the-shelf policy gradient method, and agents generalize to test-time scenarios outside their training distribution, such as navigating to objects placed in new locations. EPG is an experimental direction in automated algorithm discovery for RL.
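
The two-loop structure described above can be illustrated with a toy sketch. This is not the published EPG implementation: the scalar policy, the two-parameter quadratic loss, the goal-reaching task, and every name below (`inner_loop`, `mean_return`, `evolve_loss`, `phi`) are assumptions chosen for illustration. The outer loop uses a simple evolution-strategies update with standardized fitness.

```python
import random


def inner_loop(phi, goal, steps=5, lr=0.5):
    """Inner loop: train a scalar policy parameter by gradient descent
    on the learned loss, then report the final return."""
    theta = 0.0  # toy "policy": the agent's action is simply theta
    for _ in range(steps):
        d = goal - theta  # observed displacement to the goal (toy experience)
        # Hypothetical learned loss, with d treated as fixed data:
        #   L_phi(theta, d) = phi[0] * theta * d + phi[1] * theta**2
        grad = phi[0] * d + 2.0 * phi[1] * theta
        theta -= lr * grad
    return -(theta - goal) ** 2  # return: negative squared distance to goal


def mean_return(phi, goals):
    """Average final return of freshly trained agents over a set of goals."""
    return sum(inner_loop(phi, g) for g in goals) / len(goals)


def evolve_loss(generations=100, pop=20, sigma=0.1, alpha=0.02, seed=0):
    """Outer loop: evolve the loss parameters phi so that agents trained
    on the loss reach high return on randomly sampled tasks."""
    rng = random.Random(seed)
    phi = [0.0, 0.0]
    for _ in range(generations):
        goals = [rng.uniform(-1.0, 1.0) for _ in range(8)]  # sampled tasks
        noises = [[rng.gauss(0.0, 1.0) for _ in phi] for _ in range(pop)]
        fits = [
            mean_return([p + sigma * e for p, e in zip(phi, eps)], goals)
            for eps in noises
        ]
        mean = sum(fits) / pop
        std = (sum((f - mean) ** 2 for f in fits) / pop) ** 0.5 or 1.0
        for j in range(len(phi)):
            # ES gradient estimate from standardized fitness values
            g_j = sum((f - mean) / std * eps[j] for f, eps in zip(fits, noises))
            phi[j] += alpha * g_j / (pop * sigma)
    return phi
```

Calling `evolve_loss()` and then `mean_return(phi, goals)` on held-out goals compares agents trained with the evolved loss against the untrained baseline `phi = [0.0, 0.0]`. The design point the sketch tries to convey is that the agent never sees the reward directly during training; only the outer loop uses returns, and it uses them to shape the loss.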