Almanac
5 · arXiv cs.LG (Machine Learning) · 3h ago

AutoDex: Automated real-world dexterous grasping data collection with 4.8x throughput over teleoperation

AutoDex is an automated system for collecting physically labeled dexterous grasping data at scale, closing the perception-execution-labeling-reset loop without human intervention. The system uses 20-camera dense perception to handle hand-object occlusion, executes collision-monitored motions, and actively resets objects between trials. Across 100 objects and two robot hand platforms, AutoDex achieves 4.8x the throughput of teleoperation, and grasps drawn from its validated database succeed 76% of the time versus 34% for simulation-only validation. Code and data will be publicly released.

Related events (8)

6 · arXiv · cs.LG · 3h ago

CoorDex: Learning pipeline for continuous dexterous humanoid loco-manipulation with high-DoF hands

CoorDex is a reinforcement learning pipeline that enables humanoid robots to perform dexterous manipulation while walking, eliminating the stop-and-go pattern common in prior work. The approach trains separate privileged motion tracking teachers for body and hand, distills them into latent priors, and uses coordinated residual RL to compose them for downstream tasks. Demonstrated on a Unitree G1 humanoid with a 20-DoF WUJI hand, the system achieves non-stop bottle grasping, fridge door opening, and cube manipulation in motion. Ablations show that naive joint-space or monolithic approaches fail under the same reward budget, validating the latent-prior architecture.

5 · arXiv · cs.AI · 1mo ago

DexHoldem: A Real-World Benchmark for Dexterous Embodied Agents Using Texas Hold'em Manipulation

DexHoldem is a new system-level benchmark for evaluating dexterous embodied agents on a ShadowHand robot performing Texas Hold'em card manipulation tasks. It provides 1,470 teleoperated demonstrations across 14 manipulation primitives, a physical policy benchmark, and an agentic perception benchmark for structured game-state recovery. Top performers include π₀.₅ at 61.2% task completion and Claude Opus 4.7 at 34.3% strict perception accuracy, with GPT 5.5 achieving 66.8% field-wise accuracy. The benchmark exposes gaps between isolated visual sub-capabilities and full closed-loop embodied decision-making.

6 · OpenAI Blog · 1mo ago

Learning Dexterity: OpenAI Trains Robot Hand for Physical Object Manipulation

OpenAI announced the training of a human-like robot hand capable of manipulating physical objects with what they describe as unprecedented dexterity. The system uses reinforcement learning to develop fine motor control in a dexterous robotic hand. This work represents an early milestone in OpenAI's robotics research program, predating the later work on solving a Rubik's Cube with a robot hand.

5 · arXiv · cs.LG · 10d ago

Mana framework achieves zero-shot sim-to-real transfer for dexterous articulated tool manipulation

Researchers introduce Mana (Manipulation Animator), a sim-to-real framework that reframes dexterous robotic manipulation as an animation problem using a coarse-to-fine pipeline of procedurally generated grasp keyframes, motion planning, and reinforcement learning. The system requires minimal human input (under one minute per tool) and achieves zero-shot sim-to-real transfer across four articulated tools with varying joint types and scales. The work addresses a longstanding gap in dexterous robotics: articulated tool use, which requires coordinating internal degrees of freedom and contact-rich interactions, has been underexplored relative to rigid object manipulation.

5 · arXiv · cs.AI · 25d ago

Beyond Binary: Sim-to-Real Dexterous Manipulation with Physics-Grounded Contact Representation (CoP)

Researchers introduce Center-of-Pressure (CoP), a tactile representation grounded in physical principles designed to bridge the sim-to-real gap in contact-rich dexterous manipulation. CoP preserves dense contact information while remaining robust for sim-to-real transfer, supported by a differentiable-dynamics-based sensor calibration scheme that estimates taxel orientations without ground-truth force measurements. Evaluated on peg-in-hole insertion and ball balancing tasks, CoP-conditioned policies achieve zero-shot sim-to-real transfer on a multi-fingered robotic hand, outperforming binary-contact and raw-taxel baselines. An emergent finding is that CoP-conditioned policies implicitly encode task-relevant physical properties such as object mass.
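As a rough illustration of the idea, the textbook center of pressure is the force-weighted mean of the contact positions. The sketch below assumes that definition; the summary does not say whether the paper uses exactly this form, and the function name, the tuple-based taxel layout, and the `eps` no-contact threshold are all hypothetical.

```python
def center_of_pressure(positions, forces, eps=1e-9):
    """Force-weighted mean of taxel positions (textbook definition).

    positions: list of equal-length coordinate tuples, one per taxel.
    forces:    list of non-negative normal-force readings, one per taxel.
    Returns None when the total force is below eps (no contact).
    Assumption: this is the generic definition, not the paper's exact formula.
    """
    total = sum(forces)
    if total <= eps:
        return None
    dims = len(positions[0])
    return tuple(
        sum(f * p[d] for p, f in zip(positions, forces)) / total
        for d in range(dims)
    )


# Two taxels on a line; the heavier reading pulls the CoP toward x = 2.
print(center_of_pressure([(0.0, 0.0), (2.0, 0.0)], [1.0, 3.0]))
```

Unlike a binary contact flag, this single point moves continuously as the pressure distribution shifts, which is the kind of dense signal the summary contrasts with binary-contact baselines.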

4 · GitHub Trending · 29d ago

Dexter: Autonomous Agent for Deep Financial Research (TypeScript)

Dexter is an open-source TypeScript project implementing an autonomous agent designed for deep financial research. The repository has accumulated 26,409 stars with 237 added today, indicating significant community interest. It represents a practical deployment of agent tooling in the financial domain.

7 · OpenAI Blog · 1mo ago

Solving Rubik's Cube with a Robot Hand via Reinforcement Learning and Automatic Domain Randomization

OpenAI trained neural networks to solve a Rubik's Cube using a dexterous robot hand, with training conducted entirely in simulation via reinforcement learning. A new technique called Automatic Domain Randomization (ADR) enables the system to generalize to real-world physical perturbations not seen during training. The work demonstrates that sim-to-real transfer can achieve unprecedented dexterity in manipulation tasks.
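The core loop of a curriculum like ADR can be sketched as follows: widen a randomization range when the policy does well at its edges, and narrow it when the policy struggles. This is a simplified, symmetric version written for illustration. The thresholds, step size, and the `ParamRange` and `adr_update` names are assumptions, not values or identifiers from OpenAI's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamRange:
    """Randomization range for one simulator parameter (e.g. friction)."""
    low: float
    high: float


def adr_update(rng, boundary_success, step=0.25, expand_at=0.8, shrink_at=0.4):
    """Return an updated range given the success rate measured at its boundary.

    Hypothetical thresholds: expand when success >= expand_at,
    shrink when success <= shrink_at, otherwise leave the range alone.
    """
    if boundary_success >= expand_at:
        # Policy copes with the current extremes: make the task harder.
        return ParamRange(rng.low - step, rng.high + step)
    if boundary_success <= shrink_at:
        # Policy is failing at the extremes: pull both ends toward the middle,
        # never letting them cross.
        mid = (rng.low + rng.high) / 2
        return ParamRange(min(rng.low + step, mid), max(rng.high - step, mid))
    return rng


print(adr_update(ParamRange(1.0, 2.0), boundary_success=0.9))
```

Starting from a narrow range and letting it grow only as performance allows is what lets the training distribution become broad without hand-tuning each range.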

6 · arXiv · cs.LG · 24d ago

DynaFLIP: Dynamics-Aware Multimodal Pre-Training for Robot Manipulation Perception

DynaFLIP is a pre-training framework that injects motion understanding into visual encoders for robot manipulation by constructing image-language-3D flow triplets from human and robot videos. The method encourages tri-modal alignment via simplex-volume minimization in a shared hyperspherical space, combined with cosine regularization and contrastive objectives. The resulting dynamics-aware visual backbone consistently outperforms baselines across diverse downstream policies including VLAs, with gains up to +22.5% in out-of-distribution scenarios. The work argues that robot generalization requires encoding how the world changes under action, not just static scene content.
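One way to read "simplex-volume minimization" is as follows: normalize the three embeddings onto the unit hypersphere, treat them as the vertices of a triangle (a 2-simplex), and use its area as a loss that reaches zero when the three modalities coincide. The sketch below assumes that reading. The summary does not give the paper's exact formulation, and every name here is hypothetical.

```python
import math


def _normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def simplex_volume(image_emb, text_emb, flow_emb):
    """Area of the triangle whose vertices are the three normalized embeddings.

    Computed from the Gram determinant of the two edge vectors, so it works
    in any embedding dimension. Assumption: an illustrative loss term, not
    the paper's exact objective.
    """
    a, b, c = _normalize(image_emb), _normalize(text_emb), _normalize(flow_emb)
    e1 = [y - x for x, y in zip(a, b)]
    e2 = [y - x for x, y in zip(a, c)]
    gram_det = _dot(e1, e1) * _dot(e2, e2) - _dot(e1, e2) ** 2
    # Clamp tiny negative values caused by floating-point rounding.
    return 0.5 * math.sqrt(max(gram_det, 0.0))


# Perfectly aligned modalities give zero area; orthogonal ones do not.
print(simplex_volume([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.5, 1.0, 1.5]))
print(simplex_volume([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]))
```

A single volume term penalizes misalignment among all three modalities at once, whereas pairwise contrastive terms only see two at a time, which may be why the summary describes the two being combined.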