Entity · model

OpenAI Five

model · active · openai-five-7d86a928 · 8 events · first seen May 20, 2026

Aliases: OpenAI Five

Co-occurring entities

OpenAI · Dota 2 · Proximal Policy Optimization · self-play · Reinforcement Learning · PPO · Montezuma's Revenge · Merlini · Cap · MoonMeander · Fogged · Blitz · The International 2018 · OG · DeepMind · AlphaStar · Automatic Domain Randomization · Dactyl

More like this (12)

OpenAI, Inc. · OpenAI · OpenAI Foundation · OpenAI Voice AI · OpenAI Frontier · OpenAI Gym · OpenAI AI Accelerator · OpenAI API · OpenAI Japan · OpenAI for Healthcare · OpenAI for Government · OpenAI o1-preview

Recent events (8)

6 · OpenAI Blog · May 20, 2026

More on Dota 2: OpenAI Self-Play Reaches Superhuman Performance

OpenAI reports that a self-play reinforcement learning system progressed from below the level of high-ranked human players to beating top professional Dota 2 players within one month, playing only 1v1 mid-lane matches. The post highlights self-play as a mechanism that automatically improves the quality of training data as the agent improves, in contrast to supervised learning, which depends on a fixed dataset. The result is presented as evidence that sufficient compute combined with self-play can rapidly close the gap to human-level performance and then exceed it.

Evaluation and Benchmarking · Agent and Tool Ecosystem · self-play · OpenAI Five · Dota 2 · +2 more
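The idea that self-play keeps the training data matched to the agent's current strength can be illustrated with a small opponent-sampling sketch. It is not taken from the post: the function name `sample_opponent`, the snapshot list, and the 0.8 probability of facing the latest policy are all assumptions chosen for illustration.

```python
import random


def sample_opponent(snapshots, rng, p_latest=0.8):
    """Pick an opponent for the next self-play game.

    snapshots: saved policy versions, oldest first (hypothetical handles).
    rng: a random.Random instance, passed in so runs are reproducible.
    p_latest: probability of facing the newest policy (illustrative value).
    """
    if not snapshots:
        raise ValueError("need at least one policy snapshot")
    # Most games pit the agent against its current self, so the opponent
    # gets stronger at the same rate the agent does.
    if len(snapshots) == 1 or rng.random() < p_latest:
        return snapshots[-1]
    # The remaining games use an older snapshot, which guards against
    # forgetting how to beat earlier strategies.
    return rng.choice(snapshots[:-1])


if __name__ == "__main__":
    rng = random.Random(0)
    pool = ["policy_v1", "policy_v2", "policy_v3"]
    print([sample_opponent(pool, rng) for _ in range(5)])
```

Because the opponent pool grows with every saved snapshot, the difficulty of the games tracks the agent's own progress without any fixed dataset.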

6 · OpenAI Blog · May 20, 2026

OpenAI Five Defeats Amateur Human Teams at Dota 2

OpenAI announced that OpenAI Five, a team of five neural networks trained via self-play, had begun defeating amateur human teams at Dota 2. This was an early milestone in applying reinforcement learning to complex, long-horizon multi-agent environments. The system was trained with large-scale distributed RL, demonstrating that neural networks could coordinate in real-time strategy games without hand-crafted rules.

Evaluation and Benchmarking · Agent and Tool Ecosystem · OpenAI Five · Dota 2 · Proximal Policy Optimization · +1 more

5 · OpenAI Blog · May 20, 2026

Learning Montezuma's Revenge from a Single Demonstration

OpenAI trained a reinforcement learning agent to achieve a score of 74,500 on Montezuma's Revenge using a single human demonstration, surpassing all previously published results. The method is straightforward: the agent plays episodes starting from carefully selected states drawn from the demonstration, optimizing game score via PPO. This approach demonstrates that imitation-seeded curriculum learning can dramatically improve exploration in hard-exploration environments. The same PPO algorithm underpins OpenAI Five.

Evaluation and Benchmarking · Agent and Tool Ecosystem · OpenAI Five · PPO · OpenAI · +1 more
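The "start from selected demonstration states" idea in the Montezuma's Revenge summary can be sketched as a backward curriculum: begin episodes near the end of the demonstration and move the start point earlier once the agent succeeds often enough. This is a minimal sketch under stated assumptions, not the published implementation. The callback `rollout_success` is a hypothetical stand-in for "train with PPO from this state and report whether the agent matched the demonstrator", and the batch size and threshold are illustrative values.

```python
def run_backward_curriculum(demo_len, rollout_success,
                            batch=20, threshold=0.2, max_iters=1000):
    """Return the sequence of episode start indices visited.

    demo_len: number of saved states in the single demonstration.
    rollout_success: hypothetical callable(start_index) -> bool, True when
        an episode started at that state did at least as well as the demo.
    batch: rollouts evaluated per curriculum step (assumed value).
    threshold: success fraction needed to move earlier (assumed value).
    """
    start = demo_len - 1          # begin right before the end of the demo
    history = [start]
    for _ in range(max_iters):
        if start == 0:            # the agent now plays from the true start
            break
        wins = sum(bool(rollout_success(start)) for _ in range(batch))
        if wins / batch >= threshold:
            start -= 1            # make the task one step harder
            history.append(start)
    return history


if __name__ == "__main__":
    # Toy stand-in: the agent always succeeds, so the start point walks
    # steadily back to the beginning of the demonstration.
    print(run_backward_curriculum(5, lambda s: True))
```

Starting near the end means the agent only has to discover a short stretch of the task at a time, which is what makes the exploration problem tractable in this kind of setup.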

6 · OpenAI Blog · May 20, 2026

OpenAI Five Defeats 99.95th Percentile Dota 2 Players in Live Benchmark Match

OpenAI Five won a best-of-three series against a team of five Dota 2 players ranked in the 99.95th percentile, four of whom are professional players, in a live event watched by approximately 100,000 concurrent viewers. The match was framed as a benchmark result demonstrating the system's capability against near-top-tier human competition, and it marks a milestone in the development of OpenAI's reinforcement learning-based Dota 2 agent.

Evaluation and Benchmarking · Agent and Tool Ecosystem · Merlini · OpenAI Five · Cap · +5 more

5 · OpenAI Blog · May 20, 2026

OpenAI Five Loses Two Games Against Top Dota 2 Players at The International 2018

OpenAI Five competed against top professional Dota 2 players at The International 2018 in Vancouver, losing both games. Despite the losses, the system remained competitive for the first 20–35 minutes of each game, demonstrating meaningful progress in multi-agent reinforcement learning for complex real-time strategy environments.

Agent and Tool Ecosystem · OpenAI Five · Dota 2 · OpenAI · +1 more

7 · OpenAI Blog · May 20, 2026

OpenAI Five Defeats Dota 2 World Champions OG in Live Match

OpenAI Five became the first AI system to defeat world champion esports players in a live public match, winning two consecutive games against Dota 2 world champions OG at the OpenAI Five Finals in April 2019. This milestone followed earlier private victories by both OpenAI Five and DeepMind's AlphaStar against professional players, but it marked the first time an AI beat top esports professionals on a livestream. The result is a significant benchmark in multi-agent reinforcement learning applied to complex real-time strategy games.

Evaluation and Benchmarking · Agent and Tool Ecosystem · OG · DeepMind · OpenAI Five · +3 more

7 · OpenAI Blog · May 20, 2026

Solving Rubik's Cube with a Robot Hand via Reinforcement Learning and Automatic Domain Randomization

OpenAI trained neural networks to solve a Rubik's Cube using a dexterous robot hand, with training conducted entirely in simulation via reinforcement learning. A new technique called Automatic Domain Randomization (ADR) enables the system to generalize to real-world physical perturbations not seen during training. The work demonstrates that sim-to-real transfer can achieve unprecedented dexterity in manipulation tasks.

Frontier Model Releases · Agent and Tool Ecosystem · Automatic Domain Randomization · Dactyl · OpenAI Five · +1 more
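The summary above does not spell out how Automatic Domain Randomization works. As a rough sketch, assuming the commonly described scheme in which each simulator parameter has a randomization range that widens when the policy performs well at the range's boundary and narrows when it performs poorly, one update step might look like this. The names `ParamRange` and `adr_step`, the thresholds, and the step size are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamRange:
    nominal: float  # value used when nothing is randomized
    low: float      # current lower bound of the sampling range
    high: float     # current upper bound of the sampling range


def adr_step(r, side, boundary_score, delta=0.1,
             widen_at=0.8, shrink_at=0.4):
    """Return the range after one update of a single boundary.

    side: "low" or "high", the boundary that was just evaluated.
    boundary_score: policy performance with the parameter pinned to that
        boundary (hypothetical metric in [0, 1]).
    delta, widen_at, shrink_at: illustrative values, not from the source.
    """
    if boundary_score >= widen_at:
        change = delta        # doing well: make the environment harder
    elif boundary_score <= shrink_at:
        change = -delta       # struggling: pull the boundary back in
    else:
        return r              # in between: leave the range alone
    if side == "high":
        # Never let the upper bound drop below the nominal value.
        return ParamRange(r.nominal, r.low, max(r.nominal, r.high + change))
    if side == "low":
        # Never let the lower bound rise above the nominal value.
        return ParamRange(r.nominal, min(r.nominal, r.low - change), r.high)
    raise ValueError("side must be 'low' or 'high'")


if __name__ == "__main__":
    friction = ParamRange(nominal=1.0, low=1.0, high=1.0)
    friction = adr_step(friction, "high", boundary_score=0.9, delta=0.5)
    print(friction)
```

Under this assumption training starts with no randomization at all, and the spread of simulated conditions grows only as fast as the policy can cope with it.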

6 · OpenAI Blog · May 20, 2026

Dota 2 with Large Scale Deep Reinforcement Learning

OpenAI published a detailed account of the OpenAI Five system that defeated world-champion Dota 2 players using large-scale deep reinforcement learning. The work describes the training infrastructure, self-play curriculum, and scaling properties that enabled superhuman performance in a complex multi-agent environment. This represents a landmark result in applying RL at scale to long-horizon, high-dimensional tasks.

Training Infrastructure · AI Safety Research · OpenAI Five · Dota 2 · Proximal Policy Optimization · +1 more