5 · OpenAI Blog · 1mo ago

Sim-to-real transfer of robotic control with dynamics randomization

OpenAI published research on transferring robotic control policies trained in simulation to real-world robots using dynamics randomization. The technique involves varying physical parameters during simulation training so that the real world appears as just another variation, enabling zero-shot sim-to-real transfer. This was an early foundational contribution to the sim-to-real robotics research thread.
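A minimal sketch of the idea, assuming a toy 1-D point mass rather than a real robot simulator: each training episode draws fresh physical parameters, so a policy trained across episodes never sees one fixed set of dynamics. The parameter names, ranges, and the `simulate_push` model are hypothetical and are not taken from the OpenAI work.

```python
import random
from dataclasses import dataclass

# Hypothetical parameter ranges; real ranges depend on the robot and simulator.
PARAM_RANGES = {
    "mass": (0.5, 2.0),           # kg
    "damping": (0.01, 0.10),      # N*s/m
    "actuator_gain": (0.8, 1.2),  # scales the commanded force
}

@dataclass
class Dynamics:
    mass: float
    damping: float
    actuator_gain: float

def sample_dynamics(rng: random.Random) -> Dynamics:
    """Draw one set of physical parameters uniformly from PARAM_RANGES."""
    return Dynamics(**{k: rng.uniform(lo, hi) for k, (lo, hi) in PARAM_RANGES.items()})

def simulate_push(dyn: Dynamics, force: float = 5.0, steps: int = 100, dt: float = 0.01) -> float:
    """Toy 1-D point mass under a constant commanded force; returns final velocity."""
    v = 0.0
    for _ in range(steps):
        accel = (dyn.actuator_gain * force - dyn.damping * v) / dyn.mass
        v += dt * accel
    return v

def training_episodes(n: int, seed: int = 0):
    """Yield (dynamics, outcome) pairs, resampling the dynamics every episode."""
    rng = random.Random(seed)
    for _ in range(n):
        dyn = sample_dynamics(rng)
        # A real training loop would roll out the policy and update it here.
        yield dyn, simulate_push(dyn)
```

Resampling per episode, rather than once per training run, is what makes the real robot look like one more draw from the same distribution.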

Agent and Tool Ecosystem · dynamics randomization · sim-to-real transfer · OpenAI

Related guides (2)

OpenAI

OpenAI: The Lab That Made AI a Household Word


Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer


Related events (8)

4 · OpenAI Blog · 1mo ago

Generalizing from Simulation: OpenAI Sim-to-Real Robotics Transfer

OpenAI published results on sim-to-real transfer for robot controllers, demonstrating that policies trained entirely in simulation can be deployed on physical robots and respond to unplanned environmental changes. The work represents a shift from open-loop to closed-loop control systems in robotics. This is a 2017 research milestone predating current frontier model work but relevant to the historical trajectory of OpenAI's robotics program.

Agent and Tool Ecosystem · sim-to-real transfer · closed-loop control · OpenAI

7 · OpenAI Blog · 1mo ago

Solving Rubik's Cube with a Robot Hand via Reinforcement Learning and Automatic Domain Randomization

OpenAI trained neural networks to solve a Rubik's Cube using a dexterous robot hand, with training conducted entirely in simulation via reinforcement learning. A new technique called Automatic Domain Randomization (ADR) enables the system to generalize to real-world physical perturbations not seen during training. The work demonstrates that sim-to-real transfer can achieve unprecedented dexterity in manipulation tasks.
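One way to picture ADR, as a sketch under stated assumptions: the randomization range for a parameter starts narrow and is widened when the policy performs well at the range boundary, and narrowed when it performs poorly. The thresholds, step size, and the `RandomizedRange` and `adr_step` names below are hypothetical; the published system's schedule and performance metric may differ.

```python
from dataclasses import dataclass

@dataclass
class RandomizedRange:
    lo: float
    hi: float

def adr_step(r: RandomizedRange, boundary_performance: float, side: str,
             expand_above: float = 0.8, shrink_below: float = 0.4,
             delta: float = 0.05) -> RandomizedRange:
    """Return an updated range after evaluating the policy at one boundary.

    boundary_performance is a success rate in [0, 1] measured with the
    parameter pinned to the `side` boundary ("lo" or "hi").
    """
    if side not in ("lo", "hi"):
        raise ValueError("side must be 'lo' or 'hi'")
    if boundary_performance >= expand_above:
        change = delta       # policy copes at the boundary: widen the range
    elif boundary_performance <= shrink_below:
        change = -delta      # policy struggles: narrow the range
    else:
        return r             # in between: leave the range unchanged
    if side == "hi":
        return RandomizedRange(r.lo, max(r.lo, r.hi + change))
    return RandomizedRange(min(r.hi, r.lo - change), r.hi)
```

For example, starting from `RandomizedRange(1.0, 1.0)` (no randomization), a high success rate at the upper boundary moves `hi` up by `delta`, so later episodes sample from a wider range.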

Frontier Model Releases · Agent and Tool Ecosystem · Automatic Domain Randomization · Dactyl · OpenAI Five

4 · arXiv · cs.LG · 11d ago

Agency-transferring technique improves RL policy training by bootstrapping from baseline policies

A new arXiv paper proposes a model-free reinforcement learning method that embeds an existing suboptimal baseline policy into training via an arbitration mechanism, progressively transferring control from the baseline to a trainable neural network. The approach yields high goal-reaching rates from the start of training and produces a standalone policy that outperforms the baseline without requiring it at inference time. Theoretical bounds on goal-reaching probability are derived, and empirical results on continuous-control benchmarks show competitive or superior returns compared to existing methods.

Alignment and RLHF · An Agency-Transferring Model-Free Policy Enhancement Technique

5 · arXiv · cs.LG · 8d ago

Mana framework achieves zero-shot sim-to-real transfer for dexterous articulated tool manipulation

Researchers introduce Mana (Manipulation Animator), a sim-to-real framework that reframes dexterous robotic manipulation as an animation problem using a coarse-to-fine pipeline of procedurally generated grasp keyframes, motion planning, and reinforcement learning. The system requires minimal human input (under one minute per tool) and achieves zero-shot sim-to-real transfer across four articulated tools with varying joint types and scales. The work addresses a longstanding gap in dexterous robotics: articulated tool use, which requires coordinating internal degrees of freedom and contact-rich interactions, has been underexplored relative to rigid object manipulation.

Agent and Tool Ecosystem · Mana · Mana: Dexterous Manipulation of Articulated Tools

4 · OpenAI Blog · 1mo ago

Ingredients for robotics research

OpenAI released eight simulated robotics environments and a Baselines implementation of Hindsight Experience Replay (HER), developed over the prior year for internal research. These environments were used to train models that transfer to physical robots. The release also included a set of research requests to guide community contributions in robotics.
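The core trick in Hindsight Experience Replay can be sketched in a few lines: after an episode that missed its goal, the same transitions are stored again as if the state actually reached had been the goal, so a sparse reward still produces a learning signal. This is a simplified illustration using the "final state" relabeling strategy with hypothetical dictionary-based transitions; it is not the Baselines implementation.

```python
def sparse_reward(achieved, goal) -> float:
    """0 when the goal is met, -1 otherwise (a common sparse-reward convention)."""
    return 0.0 if achieved == goal else -1.0

def her_relabel(episode):
    """Return a copy of the episode relabeled with the final achieved state as goal.

    Each transition is a dict with keys: state, action, next_state, goal, reward.
    """
    hindsight_goal = episode[-1]["next_state"]
    return [
        {**t, "goal": hindsight_goal,
         "reward": sparse_reward(t["next_state"], hindsight_goal)}
        for t in episode
    ]

def make_episode(states, actions, goal):
    """Build transitions from a state sequence; states has len(actions) + 1 items."""
    return [
        {"state": s, "action": a, "next_state": s2, "goal": goal,
         "reward": sparse_reward(s2, goal)}
        for s, a, s2 in zip(states, actions, states[1:])
    ]
```

In use, both the original and the relabeled transitions go into the replay buffer: an episode that walked from state 0 to state 2 while aiming for 5 earns only -1 rewards originally, but its relabeled copy ends with a reward of 0.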

Agent and Tool Ecosystem · Hindsight Experience Replay · OpenAI Baselines · OpenAI

6 · arXiv · cs.AI · 1mo ago

Mind the Sim-to-Real Gap & Think Like a Scientist: Fisher-SEP for Simulation-Aided Experimental Policy

This paper studies when and how a planner should supplement a pre-trained simulator with real-world experiments in sequential decision problems. The authors decompose simulator value error into a calibration-deployment shift (identifiable via randomization) and an irreducible parametric residual, and show that purely passive learning cannot close the reachability component of the value gap. They propose Fisher-SEP, a simulation-aided experimental policy that minimizes posterior predictive variance of a target policy's value, with case studies in supply chain and HIV mobile-testing domains demonstrating regimes where designed exploration is necessary.

Evaluation and Benchmarking · AI Safety Research · vending-machine supply chain case study · Fisher-SEP · HIV mobile-testing case study

6 · arXiv · cs.AI · 9d ago

Ambient Diffusion Policy: imitation learning from suboptimal robot data via noise-dependent co-training

Researchers introduce Ambient Diffusion Policy, a method for robot imitation learning that extracts useful features from suboptimal demonstrations by restricting their contribution to specific diffusion timesteps (high and low noise levels). The approach is grounded in the observation that robot action data follows a spectral power law, inducing global-to-local hierarchy and locality properties in diffusion models. Evaluated across six tasks and four types of suboptimal data, it outperforms co-training baselines by up to 33% when scaled to the Open X-Embodiment dataset.

Training Infrastructure · Diffusion Policy · Ambient Diffusion Policy · Open X-Embodiment

5 · arXiv · cs.AI · 23d ago

Beyond Binary: Sim-to-Real Dexterous Manipulation with Physics-Grounded Contact Representation (CoP)

Researchers introduce Center-of-Pressure (CoP), a tactile representation grounded in physical principles designed to bridge the sim-to-real gap in contact-rich dexterous manipulation. CoP preserves dense contact information while remaining robust for sim-to-real transfer, supported by a differentiable-dynamics-based sensor calibration scheme that estimates taxel orientations without ground-truth force measurements. Evaluated on peg-in-hole insertion and ball balancing tasks, CoP-conditioned policies achieve zero-shot sim-to-real transfer on a multi-fingered robotic hand, outperforming binary-contact and raw-taxel baselines. An emergent finding is that CoP-conditioned policies implicitly encode task-relevant physical properties such as object mass.
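For intuition, the textbook center of pressure of a contact patch is the force-weighted average of the contact locations. The sketch below computes that quantity for a flat array of taxels; the paper's actual CoP representation and calibration scheme are not described in this summary and may differ, and the function name and 2-D layout are assumptions made for illustration.

```python
def center_of_pressure(positions, forces, eps: float = 1e-9):
    """Force-weighted centroid of taxel positions.

    positions: list of (x, y) taxel coordinates on the sensor surface.
    forces:    list of non-negative normal forces, one per taxel.
    Returns (x, y), or None when there is effectively no contact.
    """
    if len(positions) != len(forces):
        raise ValueError("positions and forces must have the same length")
    total = sum(forces)
    if total <= eps:
        return None  # no contact: the center of pressure is undefined
    x = sum(f * p[0] for p, f in zip(positions, forces)) / total
    y = sum(f * p[1] for p, f in zip(positions, forces)) / total
    return (x, y)
```

Unlike a binary contact flag, this value shifts continuously as load moves across the sensor, while compressing many raw taxel readings into a single point.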

Evaluation and Benchmarking Agent and Tool Ecosystem multi-fingered dexterous hand Center-of-Pressure (CoP) tactile representation ball balancing +5 more