
princeton-nlp/LLM-Shearing

A research project that creates efficient smaller language models by structured pruning of larger LLaMA models.

Velocity (7d): +0.7 ★ / day
Trend: steady

Sheared-LLaMA uses structured pruning to accelerate language model pre-training: it starts from a large model (e.g., Llama-2-7B) and produces smaller models (1.3B and 2.7B parameters) that remain strong for their size, at a fraction of the cost of training such models from scratch. The codebase provides the pruning and continued pre-training algorithms, and the project releases both pruned base models and instruction-tuned variants on Hugging Face.
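
To make "structured pruning" concrete, here is a minimal sketch of the general idea: whole hidden units are removed from a two-layer network, so the remaining weight matrices are smaller but still dense. This is illustrative only and is not the project's code. The function name `prune_hidden_units` and the magnitude-based scoring are assumptions made for this example; the project's own algorithm may choose which structures to remove differently.

```python
import math


def prune_hidden_units(w_in, w_out, keep):
    """Drop whole hidden units from a two-layer MLP (hypothetical helper).

    w_in:  one row per hidden unit (hidden x d_in)
    w_out: one row per output unit (d_out x hidden)
    keep:  number of hidden units to retain
    """
    # Score each hidden unit by the L2 norm of its incoming weights.
    # This magnitude heuristic is an assumption for illustration only.
    scores = [math.sqrt(sum(x * x for x in row)) for row in w_in]

    # Pick the highest-scoring units, then restore their original order.
    ranked = sorted(range(len(w_in)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:keep])

    # Structured removal: delete entire rows of w_in and the matching
    # columns of w_out, so both matrices shrink consistently.
    pruned_in = [w_in[i] for i in kept]
    pruned_out = [[row[i] for i in kept] for row in w_out]
    return pruned_in, pruned_out, kept


# Toy weights: 4 hidden units, 2 inputs, 2 outputs.
w_in = [[0.9, -1.2], [0.01, 0.02], [1.5, 0.3], [-0.05, 0.0]]
w_out = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]

pruned_in, pruned_out, kept = prune_hidden_units(w_in, w_out, keep=2)
print(kept)        # indices of the surviving hidden units
print(pruned_out)  # output weights restricted to those units
```

Because entire units are removed rather than individual weights, the pruned model needs no sparse kernels to run faster. In a pipeline like the one described above, a continued pre-training phase would then follow to recover quality lost during pruning.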
