jzhang38/TinyLlama
A project to pretrain a 1.1B parameter open-source Llama-style language model on 3 trillion tokens using 16 A100 GPUs.

Velocity (7d): +8.9 ★/day · Trend: steady
The TinyLlama project trains a compact 1.1B parameter language model following the Llama 2 architecture and tokenizer. Training uses 16 A100-40G GPUs over approximately 90 days to process 3 trillion tokens. The project releases intermediate checkpoints on HuggingFace, provides chat demos, includes speculative decoding examples with llama.cpp, and offers fine-tuning scripts for customizing the base model.
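
As a rough sanity check on the figures above, the sketch below works out the sustained throughput needed to process 3 trillion tokens in about 90 days on 16 GPUs. It assumes uninterrupted training with no downtime, restarts, or evaluation pauses, so treat the result as a lower bound on required speed, not a measured number. The function name `required_throughput` is made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_throughput(total_tokens, days, num_gpus):
    """Return (cluster tokens/s, per-GPU tokens/s) needed to finish in `days`.

    Assumes training runs continuously for the whole period (an idealization).
    """
    seconds = days * SECONDS_PER_DAY
    cluster_tps = total_tokens / seconds
    return cluster_tps, cluster_tps / num_gpus


# Figures taken from the summary: 3 trillion tokens, ~90 days, 16 GPUs.
cluster_tps, per_gpu_tps = required_throughput(3e12, 90, 16)
print(f"{cluster_tps:,.0f} tokens/s across the cluster")
print(f"{per_gpu_tps:,.0f} tokens/s per GPU")
```

Under these assumptions the cluster has to sustain roughly 386,000 tokens per second, or about 24,000 tokens per second on each A100.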