
OpenAI Five
openai-five-7d86a928 · 8 events · first seen 27d ago
Aliases: OpenAI Five
Recent events (8)
Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI published a detailed account of the OpenAI Five system that defeated world-champion Dota 2 players using large-scale deep reinforcement learning. The work describes the training infrastructure, self-play curriculum, and scaling properties that enabled superhuman performance in a complex multi-agent environment. This represents a landmark result in applying RL at scale to long-horizon, high-dimensional tasks.
OpenAI Five defeats Dota 2 world champions OG in live match
OpenAI Five became the first AI system to defeat world champion esports players in a live public match, winning two consecutive games against Dota 2 world champions OG at the OpenAI Five Finals in April 2019. This milestone followed earlier private victories by both OpenAI Five and DeepMind's AlphaStar against professional players, but marked the first time an AI beat top esports professionals on a livestream. The result is a significant benchmark in multi-agent reinforcement learning applied to complex real-time strategy games.
OpenAI Five Defeats 99.95th Percentile Dota 2 Players in Live Benchmark Match
OpenAI Five won a best-of-three series against a team of five high-ranked Dota 2 players, four of whom are professional players, in a live event watched by approximately 100,000 concurrent viewers. The match was framed as a benchmark result demonstrating the system's capability against near-top-tier human competition. This represents a milestone in the ongoing development of OpenAI's reinforcement learning-based Dota 2 agent.
OpenAI Five Defeats Amateur Human Teams at Dota 2
OpenAI announced that OpenAI Five, a team of five neural networks trained via self-play, had begun defeating amateur human teams at Dota 2. This was an early milestone in applying reinforcement learning to complex, long-horizon multi-agent environments. The system was trained using large-scale distributed RL, demonstrating that neural networks could coordinate in real-time strategy games without hand-crafted rules.
OpenAI Five Loses Two Games Against Top Dota 2 Players at The International 2018
OpenAI Five competed against top professional Dota 2 players at The International 2018 in Vancouver, losing both games. Despite the losses, the system remained competitive for the first 20–35 minutes of each game, demonstrating meaningful progress in multi-agent reinforcement learning for complex real-time strategy environments.
More on Dota 2: OpenAI Self-Play Reaches Superhuman Performance
OpenAI reports that a self-play reinforcement learning system, restricted to 1v1 mid-lane play, progressed from below high-ranked human level to beating top professional Dota 2 players within one month. The post highlights self-play as a mechanism that automatically improves training data quality as the agent improves, contrasting it with supervised learning's dependence on fixed datasets. The result is presented as evidence that sufficient compute combined with self-play can rapidly close and exceed human-level performance gaps.
Solving Rubik's Cube with a Robot Hand via Reinforcement Learning and Automatic Domain Randomization
OpenAI trained neural networks to solve a Rubik's Cube using a dexterous robot hand, with training conducted entirely in simulation via reinforcement learning. A new technique called Automatic Domain Randomization (ADR) enables the system to generalize to real-world physical perturbations not seen during training. The work demonstrates that sim-to-real transfer can achieve unprecedented dexterity in manipulation tasks.
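The core idea of Automatic Domain Randomization, widening a randomization range once the policy copes with its current boundary, can be illustrated with a small sketch. This is a simplified illustration, not OpenAI's implementation: the names `ParamRange` and `adr_update`, the thresholds `t_low` and `t_high`, and the step `delta` are all hypothetical, and `performance` is assumed to be a success rate measured with the parameter pinned at the boundary being updated.

```python
from dataclasses import dataclass


@dataclass
class ParamRange:
    """Randomization range for one simulator parameter (e.g. friction)."""
    low: float
    high: float


def adr_update(r, side, performance, t_low=0.2, t_high=0.8, delta=0.1):
    """Return an updated range for one boundary of one parameter.

    Hypothetical rule: expand the chosen boundary by `delta` when the
    success rate measured at that boundary is at least `t_high`, shrink it
    by `delta` when the rate is at most `t_low`, otherwise leave it alone.
    """
    if side not in ("low", "high"):
        raise ValueError("side must be 'low' or 'high'")

    if performance >= t_high:
        step = delta       # policy handles the boundary: widen the range
    elif performance <= t_low:
        step = -delta      # policy struggles at the boundary: narrow the range
    else:
        return r           # inconclusive: keep the range unchanged

    if side == "high":
        # Never let the upper bound drop below the lower bound.
        return ParamRange(r.low, max(r.low, r.high + step))
    # Never let the lower bound rise above the upper bound.
    return ParamRange(min(r.high, r.low - step), r.high)
```

Starting every parameter at a single value (`low == high`) and calling `adr_update` repeatedly during training makes the environment distribution grow only as fast as the policy improves, which is the property the summary above attributes to ADR.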
Learning Montezuma's Revenge from a Single Demonstration
OpenAI trained a reinforcement learning agent to achieve a score of 74,500 on Montezuma's Revenge using a single human demonstration, surpassing all previously published results. The method is straightforward: the agent plays episodes starting from carefully selected states drawn from the demonstration, optimizing game score via PPO. This approach demonstrates that imitation-seeded curriculum learning can dramatically improve exploration in hard-exploration environments. The same PPO algorithm underpins OpenAI Five.
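The start-state selection described above can be sketched as a backward curriculum over the demonstration. This is a minimal sketch under stated assumptions, not the published code: the function names, the 20% success `threshold`, the one-state `step`, and the `success_prob_fn` stand-in for an actual PPO-trained agent are all hypothetical, and no policy update is performed here.

```python
import random


def next_start_index(start_index, successes, rollouts, threshold=0.2, step=1):
    """Move the reset point earlier in the demonstration once the agent
    succeeds from the current point in at least `threshold` of rollouts."""
    if rollouts == 0:
        return start_index
    if successes / rollouts >= threshold:
        return max(0, start_index - step)
    return start_index


def run_curriculum(demo_length, success_prob_fn, rollouts_per_iter=50,
                   max_iters=1000, seed=0):
    """Simulate the curriculum and return the sequence of start indices.

    `success_prob_fn(start)` is a hypothetical stand-in for the agent: it
    gives the probability that an episode begun at demonstration state
    `start` matches the demonstrator's score. In real training this would
    come from rollouts of a policy that is being optimized between iterations.
    """
    rng = random.Random(seed)
    start = demo_length - 1          # begin near the end of the demonstration
    history = [start]
    for _ in range(max_iters):
        successes = sum(
            rng.random() < success_prob_fn(start)
            for _ in range(rollouts_per_iter)
        )
        start = next_start_index(start, successes, rollouts_per_iter)
        history.append(start)
        if start == 0:               # agent now plays from the true start
            break
    return history
```

Starting near the end keeps the reward close to the reset point, so the exploration problem at each stage stays short; the agent only faces the full game once it has mastered every later segment.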