
jax-ml/scaling-book

A blog-style textbook explaining how to scale LLMs on TPUs, covering parallelism strategies for training and inference.

Velocity (7d): +2.2 ★/day
Trend: steady

The book demystifies scaling LLMs on TPUs by explaining TPU architecture, how LLMs run at scale, and how to select parallelism schemes that avoid communication bottlenecks during training and inference. Written by Google DeepMind researchers, it covers technical topics including roofline analysis, tensor parallelism, and pipeline parallelism for large language model systems.
