
Action-Dependent Factorized Baselines

technique·active·action-dependent-factorized-baselines-ccb6691e·1 event·first seen 28d ago

Aliases: Action-Dependent Factorized Baselines

Recent events (1)

OpenAI Blog·28d ago

Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines

OpenAI published a research paper on variance reduction for policy gradient methods in reinforcement learning. The work introduces action-dependent factorized baselines, which reduce the variance of policy gradient estimates without introducing bias. Lower-variance gradient estimates generally need fewer samples to reach a given accuracy, so this training-methodology contribution is relevant to improving sample efficiency in RL.
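
To make the "lower variance, no added bias" claim concrete, here is a minimal sketch on a toy problem. It is not taken from the paper. It assumes a one-step task with two binary action dimensions, a policy that factorizes into independent Bernoulli distributions, and a hypothetical reward function. The per-dimension baseline used here (the expected reward over a_i with the other action dimension held fixed) is one illustrative choice, not necessarily the baseline the authors use. Because there are only four joint actions, the mean and variance of the score-function estimator can be computed exactly by enumeration rather than by sampling.

```python
import itertools
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def reward(a):
    # Hypothetical reward: large offset plus an interaction between dimensions.
    return 10.0 + 3.0 * a[0] + 2.0 * a[1] + 4.0 * a[0] * a[1]


def estimator_moments(theta, baseline):
    """Exact per-dimension mean and variance of the estimator
    g_i = d/dtheta_i log pi_i(a_i) * (R(a) - baseline(i, a)),
    obtained by enumerating every joint action."""
    p = sigmoid(theta)
    n = len(theta)
    mean = np.zeros(n)
    second = np.zeros(n)
    for bits in itertools.product([0, 1], repeat=n):
        a = np.array(bits, dtype=float)
        prob = np.prod(np.where(a == 1.0, p, 1.0 - p))
        score = a - p  # gradient of log Bernoulli(a_i; sigmoid(theta_i))
        adv = np.array([reward(a) - baseline(i, a) for i in range(n)])
        g = score * adv
        mean += prob * g
        second += prob * g ** 2
    return mean, second - mean ** 2


theta = np.array([0.5, -1.0])  # assumed policy logits, one per action dimension
p = sigmoid(theta)

# Expected reward under the policy, used as an action-independent baseline.
expected_reward = sum(
    np.prod(np.where(np.array(bits) == 1, p, 1.0 - p))
    * reward(np.array(bits, dtype=float))
    for bits in itertools.product([0, 1], repeat=len(theta))
)


def no_baseline(i, a):
    return 0.0


def constant_baseline(i, a):
    return expected_reward


def factorized_baseline(i, a):
    # Depends on the other action dimension but not on a[i] itself:
    # average the reward over a[i] with the rest of the action held fixed.
    a_on, a_off = a.copy(), a.copy()
    a_on[i], a_off[i] = 1.0, 0.0
    return p[i] * reward(a_on) + (1.0 - p[i]) * reward(a_off)


mean_none, var_none = estimator_moments(theta, no_baseline)
mean_const, var_const = estimator_moments(theta, constant_baseline)
mean_fact, var_fact = estimator_moments(theta, factorized_baseline)

print("means    :", mean_none, mean_const, mean_fact)
print("variances:", var_none, var_const, var_fact)
```

In this toy setup all three estimators have the same mean, because the baseline for dimension i never depends on a_i and the policy factorizes across dimensions. The baseline that also conditions on the other action dimension gives the smallest per-dimension variance here, though that ordering is specific to this example and is not a general guarantee.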