TRPO

technique · active · trpo-3ee032db · 1 event · first seen 28d ago

Aliases: TRPO

Recent events (1)

OpenAI Blog · 28d ago

OpenAI Baselines: ACKTR & A2C

OpenAI released two new implementations in its Baselines library: A2C and ACKTR. A2C is a synchronous, deterministic variant of A3C that OpenAI reports performs as well as the asynchronous original. ACKTR is more sample-efficient than both TRPO and A2C and requires only slightly more computation per update than A2C. Both additions expand the reference implementations available for reinforcement learning research. The release dates from August 2017.