jzhang38/TinyLlama

A project to pretrain a 1.1B parameter open-source Llama-style language model on 3 trillion tokens using 16 A100 GPUs.

Star velocity (7d): +8.9 ★ / day · Trend: steady

The TinyLlama project trains a compact 1.1B parameter language model following the Llama 2 architecture and tokenizer. Training uses 16 A100-40G GPUs over approximately 90 days to process 3 trillion tokens. The project releases intermediate checkpoints on HuggingFace, provides chat demos, includes speculative decoding examples with llama.cpp, and offers fine-tuning scripts for customizing the base model.
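The figures in the summary imply a rough training throughput, which a short back-of-the-envelope calculation makes concrete. The sketch below assumes training runs nonstop for the full 90 days with no downtime or restarts; that assumption is not stated in the summary. The function name `implied_throughput` is hypothetical and used only for illustration.

```python
# Figures taken from the project summary.
TOTAL_TOKENS = 3_000_000_000_000  # 3 trillion tokens
NUM_GPUS = 16                     # A100-40G GPUs
TRAINING_DAYS = 90                # approximate duration

SECONDS_PER_DAY = 24 * 60 * 60


def implied_throughput(total_tokens, num_gpus, days):
    """Return (cluster, per_gpu) tokens per second.

    Assumption (not from the source): training runs continuously
    for the whole period with no idle time.
    """
    seconds = days * SECONDS_PER_DAY
    cluster = total_tokens / seconds
    return cluster, cluster / num_gpus


cluster_tps, per_gpu_tps = implied_throughput(TOTAL_TOKENS, NUM_GPUS, TRAINING_DAYS)
print(f"cluster: {cluster_tps:,.0f} tokens/s")
print(f"per GPU: {per_gpu_tps:,.0f} tokens/s")
```

Under these assumptions the run works out to roughly 386,000 tokens per second across the cluster, or about 24,000 tokens per second per GPU. Any downtime would raise the throughput needed to finish within the same window.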
