Hugging Face's open recipe for cloning DeepSeek-R1
A community effort to reverse-engineer and openly reproduce the training pipeline behind DeepSeek's famous reasoning model.

What it does
Open R1 is a work-in-progress toolkit that aims to rebuild the entire DeepSeek-R1 pipeline in the open. It provides training scripts for supervised fine-tuning and GRPO reinforcement learning, plus data generation tools that use Distilabel to distill reasoning traces from DeepSeek-R1 itself. A Makefile ties the steps together so you can run the pipeline without memorizing long shell commands.
The interesting bit
The project is deliberately simple, with just three core scripts and a Makefile, because the real work is in the data and the recipes. The team has already completed “Step 1” by releasing Mixture-of-Thoughts, a 350k-sample verified reasoning dataset, and training a 7B model that is competitive with DeepSeek’s distilled version on math and coding benchmarks.
Key highlights
- Releases curated datasets: Mixture-of-Thoughts (350k traces), CodeForces-CoTs (10k problems, 100k solutions), and OpenR1-Math-220k
- OpenR1-Distill-7B scores 52.7 on AIME 2024 versus 51.3 for DeepSeek’s distilled model, but trails on MATH-500 (89.0 versus 93.5)
- Supports SFT and GRPO training via Accelerate + DeepSpeed ZeRO-2/3, with a vLLM backend for scalable generation
- Single-node and multi-node Slurm recipes provided, including colocated vLLM mode for smaller models
- Data generation recipes to distill from either small models or the full DeepSeek-R1
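For readers new to GRPO, the core idea is that several completions are sampled for the same prompt and each one is scored relative to its own group, so no separate value model is needed. The sketch below illustrates only that normalization step. It is not Open R1's code: the function name, the reward values, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Score each completion against the others sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; sample std is also used in practice
    # eps avoids division by zero when every completion gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical verifier rewards for four completions of one prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print([round(a, 3) for a in advantages])  # prints [1.0, -1.0, -1.0, 1.0]
```

Correct completions get a positive advantage and incorrect ones a negative advantage. When every completion in a group earns the same reward, all advantages are zero and that prompt contributes no learning signal.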
Caveats
- Requires CUDA 12.4 and PyTorch 2.6.0; version mismatches can cause segmentation faults
- Training configs target 8× H100 (80GB) nodes; you’ll need to retune batch sizes for other hardware
- Chat template and EOS token handling is finicky and varies by base model (Qwen, Llama, etc.)
- Steps 2 and 3 (pure RL pipeline, full multi-stage training) are still incomplete
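On the hardware caveat: one common way to retune for a smaller machine is to hold the global batch size constant and make up the difference with gradient accumulation. The sketch below shows that arithmetic. The helper name and all numbers are hypothetical and are not taken from Open R1's configs.

```python
def grad_accum_steps(global_batch: int, per_device_batch: int, num_gpus: int) -> int:
    """Return the accumulation steps needed to reach global_batch samples per optimizer update."""
    samples_per_step = per_device_batch * num_gpus
    if global_batch % samples_per_step != 0:
        raise ValueError("global batch must be divisible by per_device_batch * num_gpus")
    return global_batch // samples_per_step


# Hypothetical reference setup: per-device batch 4, 8 GPUs, 4 accumulation steps.
reference_global = 4 * 8 * 4  # 128 samples per optimizer update

# Hypothetical smaller machine: 2 GPUs that only fit a per-device batch of 2.
steps = grad_accum_steps(reference_global, per_device_batch=2, num_gpus=2)
print(steps)  # prints 32
```

Keeping the global batch fixed means the learning rate from the reference recipe is more likely to carry over, at the cost of slower wall-clock training.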
Verdict
Worth a look if you’re researching reasoning models or need a reproducible baseline for distillation. Skip it if you want a polished, end-to-end product: this is explicitly a construction site, not a finished building.