horseee/LLM-Pruner
A structural pruning framework for compressing large language models.

Velocity (7d): +1.0 ★/day · Trend: steady
LLM-Pruner performs structural pruning on large language models (LLMs), reducing model size and compute requirements while aiming to preserve the model's functionality. It supports the Llama, Llama-2, Llama-3, BLOOM, Vicuna, ChatGLM, Baichuan, and TinyLlama architectures. It uses gradient-based importance estimation to decide which structural components to remove.
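
To make the idea of gradient-based importance estimation concrete, here is a minimal sketch in plain Python. It is not LLM-Pruner's API. The function names, the toy weights and gradients, and the scoring rule (a first-order Taylor estimate, summing |weight × gradient| over each row) are all assumptions chosen for illustration. The description above does not say which criterion the project uses.

```python
# Illustrative sketch only: NOT LLM-Pruner's actual API or exact criterion.
# Each row of the weight matrix stands in for one structural unit
# (for example, one output neuron). All names here are hypothetical.

def taylor_importance(weights, grads):
    """Return one score per row: the sum of |w * dL/dw| over that row."""
    return [
        sum(abs(w * g) for w, g in zip(w_row, g_row))
        for w_row, g_row in zip(weights, grads)
    ]


def prune_rows(weights, grads, keep_ratio):
    """Keep the highest-scoring fraction of rows, preserving row order."""
    scores = taylor_importance(weights, grads)
    n_keep = max(1, round(len(weights) * keep_ratio))
    ranked = sorted(range(len(weights)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [weights[i] for i in kept], kept


# Made-up toy data: 4 rows (structural units), 3 weights each.
weights = [
    [0.5, -1.0, 0.2],
    [0.01, 0.02, -0.01],
    [1.5, 0.3, -0.7],
    [0.1, 0.0, 0.05],
]
# Made-up gradients of a loss with respect to each weight.
grads = [
    [0.1, 0.2, -0.3],
    [0.5, -0.4, 0.3],
    [0.05, -0.1, 0.2],
    [0.0, 0.9, 0.01],
]

pruned, kept = prune_rows(weights, grads, keep_ratio=0.5)
print(kept)  # indices of the rows that survive pruning
```

In a real model, the gradients would come from backpropagation on a small calibration set, and a structural unit would typically span several coupled layers rather than one matrix row.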