
FareedKhan-dev/train-deepseek-r1

A Jupyter notebook and guide walking through the step-by-step implementation of DeepSeek R1's training process using GRPO reinforcement learning.

769 stars · Jupyter Notebook · Language Models · ML Frameworks · Learning
Star velocity (7d): +1.6 ★ / day · Trend: steady

The repository provides a hands-on implementation of DeepSeek R1’s training methodology, covering reinforcement learning fundamentals, the GRPO algorithm, reward functions for accuracy and format validation, and policy model setup. It includes explanatory markdown documents with hand-drawn diagrams to help non-technical audiences understand LLM training concepts.
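To make the reward functions and GRPO's group-relative scoring concrete, here is a minimal sketch in plain Python. It is not taken from the repository: the function names, the `<think>`/`<answer>` tag layout, the exact-match accuracy check, and the use of population standard deviation are all assumptions chosen for illustration.

```python
import re
from statistics import mean, pstdev

# Assumed output layout: reasoning in <think> tags, final answer in <answer> tags.
# The repository's actual format check may differ.
FORMAT_PATTERN = re.compile(
    r"^<think>.*?</think>\s*<answer>(.*?)</answer>$", re.DOTALL
)


def format_reward(completion: str) -> float:
    """Return 1.0 if the completion follows the assumed tag layout, else 0.0."""
    return 1.0 if FORMAT_PATTERN.match(completion.strip()) else 0.0


def accuracy_reward(completion: str, expected: str) -> float:
    """Return 1.0 if the text inside <answer> exactly equals the expected answer."""
    match = FORMAT_PATTERN.match(completion.strip())
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == expected.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    GRPO scores several completions sampled for the same prompt and uses
    these normalized values in place of a learned value function.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of completions sampled for the prompt "What is 2 + 2?"
completions = [
    "<think>2 + 2 = 4</think><answer>4</answer>",  # well-formed and correct
    "4",                                           # correct value, wrong format
    "<think>a guess</think><answer>5</answer>",    # well-formed but incorrect
]
rewards = [format_reward(c) + accuracy_reward(c, "4") for c in completions]
advantages = group_relative_advantages(rewards)
```

In this sketch the first completion gets a positive advantage, the unformatted one a negative advantage, and the well-formed but wrong one lands at the group mean. A real training loop would feed these advantages into the policy-gradient update, which is omitted here.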
