datawhalechina/diy-llm
A systematic Jupyter Notebook course teaching how to build large language models from scratch, covering pretraining, alignment, and distributed training.

The course provides a structured curriculum for building LLMs from the ground up. It covers core components, including tokenizers, transformer architectures, and mixture-of-experts models. Students work through six hands-on assignments on distributed training, alignment techniques (SFT and RLHF), and GPU programming with CUDA and Triton. The material also addresses inference optimization and scaling laws, giving a full-stack view of LLM development.