Karpathy's LLM course: ambitious syllabus, empty classroom
A from-scratch LLM curriculum that promises to take you from bigrams to diffusion transformers—if it ever ships.

What it does
This is the landing page for LLM101n, a planned course by Andrej Karpathy that would walk learners through building a complete “Storyteller” LLM from the absolute basics up to a ChatGPT-like web app. The syllabus covers 17 chapters spanning language modeling fundamentals, transformer architecture, training optimization, inference tricks, fine-tuning with RLHF, and even multimodal generation.
The interesting bit
The scope is deliberately maximalist: the stack runs from Python through C to CUDA and the topics from assembly to diffusion transformers, all supposedly with “minimal computer science prerequisites.” It’s the pedagogical inverse of the usual “import transformers” tutorial: more like building a rocket to understand why airplanes fly.
Key highlights
- 17-chapter progression from bigram models through distributed training and quantization to multimodal generation with VQ-VAE and diffusion
- Explicit low-level focus: hand-coded backprop, custom CUDA kernels, mixed precision, DDP/ZeRO
- Targets deployment reality: building an API and web app, KV-cache optimization, and LoRA/RLHF fine-tuning
- Appendix sketches even deeper cuts: assembly programming, tensor memory layouts, GPT-1 through Llama-3 architecture evolution
- Backed by Eureka Labs, Karpathy’s own AI education company
Caveats
- The course does not exist yet. The repo is archived until development finishes; no timeline is given
- Syllabus is aspirational: 17 chapters plus extensive appendix topics may be optimistic for a single course
- No visible code, exercises, or video content—just a README and a Feynman quote
Verdict
Worth bookmarking if you learn by building from first principles and can wait indefinitely. If you need working LLM knowledge this quarter, stick to Karpathy’s existing nanoGPT and videos—this is a promise, not a product.