datawhalechina/tiny-universe
A from-scratch implementation guide covering Tiny Llama3, Tiny RAG, Tiny Agent, Tiny Diffusion, Tiny Eval, and related LLM system components.

This project is an educational guide to implementing large language model systems from first principles. It covers training a small Llama3 model, building RAG and GraphRAG retrieval frameworks, constructing an agent system, implementing diffusion models for image generation, and creating an evaluation toolkit. Each module includes a complete, commented code implementation to help learners understand the mechanisms underlying modern LLM systems.