Entity · model

Mini-R1

model · active · mini-r1-98b17325 · 1 event · first seen May 19, 2026

Aliases: Mini-R1

More like this (12)

o1-mini · MiniMax 2.5 · Reachy Mini · MiniMax · Open R1 · Palmyra-mini · Phi-4-mini · o3-mini · o4-mini · North Mini Code · SD-Tiny · GPT-4.1 mini

Recent events (1)

5 · Hugging Face Blog · May 19, 2026

Mini-R1: Reproducing DeepSeek R1 'Aha Moment' — An RL Tutorial

A Hugging Face blog post shows how to reproduce DeepSeek R1's emergent 'aha moment' reasoning behavior by applying reinforcement learning to a Countdown game task. The tutorial walks through training a smaller model with RL until it exhibits chain-of-thought self-correction similar to that observed in DeepSeek R1. The post is a practical open-source replication effort aimed at demystifying R1's training dynamics.

Frontier Model Releases · Open Weights Progress · DeepSeek V4 · GRPO · Open R1
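For readers unfamiliar with the Countdown task mentioned in the summary, RL setups of this kind typically score each model completion with a rule-based check rather than a learned reward model. The sketch below is illustrative only: the function name `countdown_reward`, the `<answer>` tag convention, and the 0/1 scoring are assumptions, not details taken from the blog post. It shows one plausible way to verify that a proposed equation uses each given number exactly once and reaches the target.

```python
import ast
import operator
import re
from collections import Counter
from fractions import Fraction

# Only the four basic arithmetic operators are allowed in an answer.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Evaluate an arithmetic AST exactly, recording each integer literal in `used`."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return Fraction(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def countdown_reward(completion, numbers, target):
    """Hypothetical reward: 1.0 if the answer is a valid solution, else 0.0.

    Assumes the model wraps its final equation in <answer>...</answer>.
    """
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if match is None:
        return 0.0  # no parseable answer
    try:
        tree = ast.parse(match.group(1).strip(), mode="eval")
        used = []
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0  # malformed or disallowed expression
    if Counter(used) != Counter(numbers):
        return 0.0  # every given number must be used exactly once
    return 1.0 if value == target else 0.0


if __name__ == "__main__":
    sample = "<think>25 - 5 is 20, times 3 is 60.</think><answer>(25 - 5) * 3</answer>"
    print(countdown_reward(sample, [3, 5, 25], 60))
```

Parsing the expression with `ast` instead of calling `eval` keeps arbitrary code out of the reward path, and `Fraction` avoids floating-point error when an answer involves division.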