srush/LLM-Training-Puzzles
An 8-puzzle interactive tutorial by Sasha Rush teaching distributed training of large language models across many GPUs.

Velocity (7d): +1.1 ★/day · Trend: steady
This repository contains a set of interactive Jupyter notebooks with challenges covering the technical aspects of training LLMs on thousands of GPUs. Topics include memory efficiency strategies and compute pipelining techniques for distributed neural network training. The puzzles serve as hands-on education for practitioners who rarely get to work with large-scale GPU clusters.