Meta's original Llama repo is now a redirect sign
The 59k-star repository that launched the open-weights era has been deprecated and split into five successor projects.

What it does
This was Meta’s original reference implementation for running Llama 2 inference locally — minimal PyTorch code, a shell script for downloading weights via signed URLs, and example scripts for text and chat completion. It supported 7B, 13B, and 70B parameter models with hardcoded model-parallel requirements (1, 2, and 8 GPUs respectively).
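To make the model-parallel requirement concrete, here is a minimal sketch of how a launcher might map model size to the number of processes handed to torchrun. The mapping (1, 2, 8) comes from the description above. The helper name, the script name, and the flag names (--ckpt_dir, --tokenizer_path, --max_seq_len, --max_batch_size) are recalled from the repo's example scripts and should be treated as assumptions, not a verified interface.

```python
# Model-parallel (MP) values as described above: 7B -> 1, 13B -> 2, 70B -> 8.
MP_BY_SIZE = {"7B": 1, "13B": 2, "70B": 8}


def torchrun_command(size, script, ckpt_dir, tokenizer_path="tokenizer.model",
                     max_seq_len=512, max_batch_size=6):
    """Build a torchrun argv list (hypothetical helper, not part of the repo).

    --nproc_per_node is set to the MP value for the chosen model size.
    The script flags are assumptions based on the repo's example scripts.
    """
    try:
        mp = MP_BY_SIZE[size.upper()]
    except KeyError:
        raise ValueError(f"unknown model size: {size!r}") from None
    return [
        "torchrun", "--nproc_per_node", str(mp),
        script,
        "--ckpt_dir", ckpt_dir,
        "--tokenizer_path", tokenizer_path,
        "--max_seq_len", str(max_seq_len),
        "--max_batch_size", str(max_batch_size),
    ]


if __name__ == "__main__":
    # Print the command for a 13B chat checkpoint (directory name is illustrative).
    print(" ".join(torchrun_command("13b", "example_chat_completion.py",
                                    "llama-2-13b-chat/")))
```

Because the process count is tied to how the checkpoint was sharded, a 70B checkpoint cannot simply be launched on fewer GPUs by changing the number; that is the "hardcoded" part.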
The interesting bit
The README now functions as an archaeological site. Meta has split the single repository into a “Llama Stack”: model code and model cards live in llama-models, safety tooling in PurpleLlama, inference interfaces in llama-toolchain, agentic systems in llama-agentic-system, and community recipes in llama-cookbook. The original repo’s value was always its simplicity; that simplicity has been deliberately dismantled.
Key highlights
- Deprecated as of Llama 3.1 release; active development moved to five successor repositories
- Originally provided bare-bones inference for 7B–70B models with torchrun and manual MP configuration
- Weight downloads required Meta license approval via email with 24-hour expiring signed URLs
- Chat models required rigid prompt formatting with INST, <<SYS>>, and specific whitespace handling
- Hugging Face mirror available with ~1 hour approval turnaround
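The rigid chat formatting mentioned in the highlights can be sketched roughly as follows. This is a minimal illustration, assuming the bracketed [INST] / [/INST] delimiters and the <<SYS>> block with the exact newlines shown; these are recalled from the Llama 2 reference code rather than taken from this summary. The function name is hypothetical, and the BOS/EOS tokens that the tokenizer adds around each turn are deliberately left out.

```python
# Delimiters as recalled from the Llama 2 reference code (assumption).
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"


def format_first_turn(user_message, system_prompt=None):
    """Return the text of a single-turn chat prompt, without BOS/EOS tokens.

    The system prompt, if any, is folded into the first user message,
    and the user text is stripped so the spacing around the tags is exact.
    """
    user_message = user_message.strip()
    if system_prompt is not None:
        user_message = B_SYS + system_prompt + E_SYS + user_message
    return f"{B_INST} {user_message} {E_INST}"


if __name__ == "__main__":
    # repr() makes the embedded newlines visible.
    print(repr(format_first_turn("What is model parallelism?", "Be brief.")))
```

The point of the sketch is the whitespace: a single space after the opening tag, a single space before the closing tag, and fixed newlines around the system block. Deviating from the format the chat models were tuned on tends to degrade output quality.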
Caveats
- The deprecation notice sits at the top of the README, but it is easy to miss if an old link or citation drops you somewhere other than the repo's front page
- No migration guide provided; you’re expected to self-navigate the five replacement repos
- Original download mechanism (email URLs, wget, md5sum) feels archaic by current standards
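For anyone still maintaining that download flow, the md5sum verification step can be approximated in Python. This is a sketch under assumptions: it expects a checklist file in the usual md5sum layout (one "<hash>  <filename>" pair per line) sitting next to the downloaded files. The checklist file name in the usage line and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte weight shards fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checklist(checklist_path):
    """Check every '<md5>  <filename>' line; return the names that fail.

    A file counts as failed if it is missing or its hash does not match.
    Paths are resolved relative to the checklist's own directory.
    """
    checklist = Path(checklist_path)
    failures = []
    for line in checklist.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum prefixes binary-mode entries with '*'
        target = checklist.parent / name
        if not target.is_file() or md5_of(target) != expected.lower():
            failures.append(name)
    return failures


if __name__ == "__main__":
    # Illustrative path; adjust to wherever the weights were downloaded.
    print(verify_checklist("llama-2-7b/checklist.chk"))
```

An empty list means every listed file is present and intact; anything else names the shards worth downloading again before the signed URL expires.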
Verdict
Worth visiting only for historical context or if you’re maintaining legacy Llama 2 infrastructure. New users should start with llama-models or llama-toolchain instead. Researchers tracing the evolution of open-weights release practices may find the README’s transformation instructive.