
therealoliver/Deepdive-llama3-from-scratch

Educational project implementing the Llama3 transformer architecture from scratch in Python notebooks.

629 stars · Jupyter Notebook · Language Models · Learning
Velocity (7d): +1.3 ★ / day · Trend: steady

This repository provides a step-by-step walkthrough of implementing Llama3 inference from scratch. It covers all core components of the transformer architecture including multi-head attention with RoPE positional encoding, SwiGLU activation, RMS normalization, KV-cache optimization, and tokenization. The project is structured as Jupyter Notebooks with detailed code annotations and dimension tracking to help learners understand each computation step.
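Three of the components named above (RMS normalization, RoPE, and the SwiGLU gate) are small enough to sketch in plain Python. This is a minimal illustration, not code from the repository: the function names, the `eps` and `theta` defaults, and the choice to rotate consecutive pairs of dimensions in `rope` are all assumptions made for the example, and it works on single vectors stored as lists rather than on batched tensors.

```python
import math


def rms_norm(x, weight, eps=1e-5):
    """Divide x by its root mean square, then scale by a per-dimension weight.

    eps is an illustrative default that guards against division by zero.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def rope(x, position, theta=10000.0):
    """Rotate each pair (x[0], x[1]), (x[2], x[3]), ... by a position-dependent angle.

    Assumes len(x) is even. theta is an illustrative base; pairing consecutive
    dimensions is one common convention, assumed here.
    """
    dim = len(x)
    out = []
    for i in range(0, dim, 2):
        freq = 1.0 / (theta ** (i / dim))  # lower frequency for later pairs
        angle = position * freq
        c, s = math.cos(angle), math.sin(angle)
        a, b = x[i], x[i + 1]
        out.extend([a * c - b * s, a * s + b * c])  # 2D rotation of (a, b)
    return out


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu(gate, up):
    """Elementwise SiLU(gate) * up, the gating step of a SwiGLU feed-forward layer."""
    return [silu(g) * u for g, u in zip(gate, up)]


# Dimension tracking for a toy 4-dimensional vector.
x = [1.0, 2.0, 3.0, 4.0]
normed = rms_norm(x, weight=[1.0] * 4)   # length 4
rotated = rope(normed, position=3)       # length 4, same vector norm as `normed`
gated = swiglu(rotated, rotated)         # length 4
```

Because `rope` applies a pure rotation to each pair, it leaves the length of the vector unchanged, and the dot product between a rotated query and a rotated key depends only on the difference of their positions. That relative-position property is what makes the rotation usable as a positional encoding inside attention.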
