6 · OpenAI Blog · 1mo ago

Learning Dexterity: OpenAI Trains Robot Hand for Physical Object Manipulation

OpenAI announced that it had trained a human-like robot hand to manipulate physical objects with what it describes as unprecedented dexterity. The system uses reinforcement learning to develop fine motor control in a dexterous robotic hand. This work is an early milestone in OpenAI's robotics research program, predating the later work in which a robot hand solved a Rubik's Cube.

Agent and Tool Ecosystem · OpenAI Dexterous Hand · Reinforcement Learning · OpenAI

Related guides (3)

OpenAI

OpenAI: The Lab That Made AI a Household Word

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

Reinforcement Learning · Concept

Reinforcement Learning: How AI Learns by Doing

Related events (8)

7 · OpenAI Blog · 1mo ago

Solving Rubik's Cube with a Robot Hand via Reinforcement Learning and Automatic Domain Randomization

OpenAI trained neural networks to solve a Rubik's Cube using a dexterous robot hand, with training conducted entirely in simulation via reinforcement learning. A new technique called Automatic Domain Randomization (ADR) enables the system to generalize to real-world physical perturbations not seen during training. The work demonstrates that sim-to-real transfer can achieve unprecedented dexterity in manipulation tasks.

Frontier Model Releases · Agent and Tool Ecosystem · Automatic Domain Randomization · Dactyl · OpenAI Five
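
The Automatic Domain Randomization idea summarized in the item above can be illustrated with a toy sketch. This is not OpenAI's implementation: the class `ParamRange`, the function `update_range`, the thresholds, and the friction example are all hypothetical names and values chosen for illustration. The sketch assumes the general scheme in which each simulated physics parameter is sampled from a range that widens when the policy performs well and narrows when it performs poorly.

```python
import random
from dataclasses import dataclass


@dataclass
class ParamRange:
    """Randomization range for one simulator parameter (hypothetical helper)."""
    low: float
    high: float
    step: float  # how far each bound moves per update

    def sample(self, rng: random.Random) -> float:
        # Draw the value used for one simulated training episode.
        return rng.uniform(self.low, self.high)

    def widen(self) -> None:
        # Harder curriculum: expose the policy to more variation.
        self.low -= self.step
        self.high += self.step

    def narrow(self) -> None:
        # Easier curriculum, but never let the bounds cross.
        if (self.high - self.low) - 2 * self.step >= 0.0:
            self.low += self.step
            self.high -= self.step


def update_range(r: ParamRange, success_rate: float,
                 widen_above: float = 0.8, narrow_below: float = 0.4) -> ParamRange:
    """Adjust the range from a measured success rate (thresholds are assumptions)."""
    if success_rate >= widen_above:
        r.widen()
    elif success_rate < narrow_below:
        r.narrow()
    return r


# Usage: a friction multiplier that starts close to its nominal value of 1.0.
friction = ParamRange(low=0.9, high=1.1, step=0.05)
update_range(friction, success_rate=0.9)   # policy is doing well, so widen
episode_friction = friction.sample(random.Random(0))
```

The design choice worth noting is that the range is driven by measured performance rather than fixed in advance, so the variation the policy sees grows only as fast as the policy can handle it.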

5 · arXiv · cs.LG · 8d ago

Mana framework achieves zero-shot sim-to-real transfer for dexterous articulated tool manipulation

Researchers introduce Mana (Manipulation Animator), a sim-to-real framework that reframes dexterous robotic manipulation as an animation problem using a coarse-to-fine pipeline of procedurally generated grasp keyframes, motion planning, and reinforcement learning. The system requires minimal human input (under one minute per tool) and achieves zero-shot sim-to-real transfer across four articulated tools with varying joint types and scales. The work addresses a longstanding gap in dexterous robotics: articulated tool use, which requires coordinating internal degrees of freedom and contact-rich interactions, has been underexplored relative to rigid object manipulation.

Agent and Tool Ecosystem · Mana · Mana: Dexterous Manipulation of Articulated Tools

4 · OpenAI Blog · 1mo ago

Generalizing from Simulation: OpenAI Sim-to-Real Robotics Transfer

OpenAI published results on sim-to-real transfer for robot controllers, demonstrating that policies trained entirely in simulation can be deployed on physical robots and respond to unplanned environmental changes. The work represents a shift from open-loop to closed-loop control systems in robotics. This is a 2017 research milestone predating current frontier model work but relevant to the historical trajectory of OpenAI's robotics program.

Agent and Tool Ecosystem · sim-to-real transfer · closed-loop control · OpenAI

6 · OpenAI Blog · 1mo ago

Dota 2 with Large Scale Deep Reinforcement Learning

OpenAI published a detailed account of the OpenAI Five system that defeated world-champion Dota 2 players using large-scale deep reinforcement learning. The work describes the training infrastructure, self-play curriculum, and scaling properties that enabled superhuman performance in a complex multi-agent environment. This represents a landmark result in applying RL at scale to long-horizon, high-dimensional tasks.

Training Infrastructure · AI Safety Research · OpenAI Five · Dota 2 · Proximal Policy Optimization

6 · OpenAI Blog · 1mo ago

OpenAI Five Defeats Amateur Human Teams at Dota 2

OpenAI announced that OpenAI Five, a team of five neural networks trained via self-play, had begun defeating amateur human teams at Dota 2. This was an early milestone in applying reinforcement learning to complex, long-horizon multi-agent environments. The system was trained using large-scale distributed RL, demonstrating that neural networks could coordinate in real-time strategy games without hand-crafted rules.

Evaluation and Benchmarking · Agent and Tool Ecosystem · OpenAI Five · Dota 2 · Proximal Policy Optimization

5 · OpenAI Blog · 1mo ago

Competitive Self-Play Enables Emergent Physical Skills in Simulated Agents

OpenAI demonstrates that competitive self-play allows simulated agents to spontaneously develop complex physical skills—tackling, ducking, faking, kicking, catching, and diving—without explicit environment design for those behaviors. The self-play dynamic automatically calibrates difficulty to the agent's current skill level. Combined with concurrent Dota 2 self-play results, OpenAI expresses confidence that self-play will be a foundational component of powerful AI systems.

Agent and Tool Ecosystem · Alignment and RLHF · Dota 2 · Competitive Self-Play · OpenAI

5 · arXiv · cs.AI · 1mo ago

DexHoldem: A Real-World Benchmark for Dexterous Embodied Agents Using Texas Hold'em Manipulation

DexHoldem is a new system-level benchmark for evaluating dexterous embodied agents on a ShadowHand robot performing Texas Hold'em card manipulation tasks. It provides 1,470 teleoperated demonstrations across 14 manipulation primitives, a physical policy benchmark, and an agentic perception benchmark for structured game-state recovery. Top performers include π₀.₅ at 61.2% task completion and Claude Opus 4.7 at 34.3% strict perception accuracy, with GPT 5.5 achieving 66.8% field-wise accuracy. The benchmark exposes gaps between isolated visual sub-capabilities and full closed-loop embodied decision-making.

Evaluation and Benchmarking · Agent and Tool Ecosystem · Claude Opus 4.6 · π₀.₅ · Physical Intelligence

5 · OpenAI Blog · 1mo ago

OpenAI Releases RL-Teacher: Open-Source Human Feedback Interface for RL

OpenAI released RL-Teacher, an open-source implementation of an interface for training AI systems using occasional human feedback instead of hand-crafted reward functions. The tool implements a technique developed as a step toward safer AI systems and is applicable to reinforcement learning problems where reward specification is difficult. This represents an early public release of human-in-the-loop RL tooling from OpenAI.

AI Safety Research · Agent and Tool Ecosystem · RL-Teacher · Reinforcement Learning from Human Feedback · OpenAI
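
To make "occasional human feedback instead of hand-crafted reward functions" in the RL-Teacher item concrete, here is a minimal sketch of how a single human comparison between two behavior clips can become a training signal for a learned reward model. It does not use RL-Teacher's actual API, which the summary above does not show. The function names are hypothetical, and the Bradley-Terry style preference model is an assumption: it is a formulation commonly used in preference-based RL, and may differ from what the tool implements.

```python
import math
from typing import Sequence


def preference_probability(rewards_a: Sequence[float],
                           rewards_b: Sequence[float]) -> float:
    """Modelled probability that a human prefers clip A over clip B.

    Each argument holds the learned reward model's per-step predictions
    for one clip. Assumption: preference follows a softmax over the
    summed predicted rewards of the two clips.
    """
    sum_a, sum_b = sum(rewards_a), sum(rewards_b)
    shift = max(sum_a, sum_b)  # subtract the max for numerical stability
    exp_a, exp_b = math.exp(sum_a - shift), math.exp(sum_b - shift)
    return exp_a / (exp_a + exp_b)


def preference_loss(rewards_a: Sequence[float],
                    rewards_b: Sequence[float],
                    human_prefers_a: bool) -> float:
    """Cross-entropy between the model's preference and the human label."""
    p_a = preference_probability(rewards_a, rewards_b)
    p_label = p_a if human_prefers_a else 1.0 - p_a
    return -math.log(max(p_label, 1e-12))  # clamp to avoid log(0)


# Usage: the reward model currently rates clip A higher than clip B.
clip_a, clip_b = [0.5, 0.7, 0.9], [0.1, 0.2, 0.1]
loss_if_agree = preference_loss(clip_a, clip_b, human_prefers_a=True)
loss_if_disagree = preference_loss(clip_a, clip_b, human_prefers_a=False)
```

Minimizing this loss over many labelled pairs pushes the reward model toward the human's judgments, and a standard RL algorithm can then optimize the learned reward in place of a hand-written one.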