therealoliver/Deepdive-llama3-from-scratch
Educational project implementing the Llama3 transformer architecture from scratch in Python notebooks.

Velocity (7d): +1.3 ★/day · Trend: steady
This repository provides a step-by-step walkthrough of implementing Llama3 inference from scratch. It covers all core components of the transformer architecture including multi-head attention with RoPE positional encoding, SwiGLU activation, RMS normalization, KV-cache optimization, and tokenization. The project is structured as Jupyter Notebooks with detailed code annotations and dimension tracking to help learners understand each computation step.
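As a rough illustration of three of the components listed above, the following is a minimal pure-Python sketch of RMS normalization, a SwiGLU feed-forward step, and the RoPE rotation of a single feature pair. It is not taken from the repository's notebooks: the function names, the list-based vectors, and the default values (`eps=1e-5`, `theta=500000.0`) are assumptions chosen for illustration, and real implementations operate on batched tensors.

```python
import math


def rms_norm(x, weight, eps=1e-5):
    """Divide x by its root-mean-square, then scale by a learned per-feature weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


def rope_rotate_pair(x0, x1, position, pair_index, head_dim, theta=500000.0):
    """Rotate one (x0, x1) feature pair by an angle that depends on token position.

    The frequency falls off with pair_index, so early pairs rotate quickly
    and later pairs rotate slowly.
    """
    freq = theta ** (-2.0 * pair_index / head_dim)
    angle = position * freq
    c, s = math.cos(angle), math.sin(angle)
    return x0 * c - x1 * s, x0 * s + x1 * c
```

Because RoPE is a pure rotation, it leaves the length of each feature pair unchanged, and RMS normalization with a unit weight produces a vector whose mean square is approximately 1. Both properties are convenient sanity checks when tracking dimensions and values through a from-scratch implementation.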