HuangOwen/Awesome-LLM-Compression
A categorized collection of research papers and software tools for compressing large language models.

Velocity (7d): +1.7 ★/day · Trend: steady
This repository aggregates academic papers and software tools for compressing LLMs to accelerate both training and inference. It covers the major compression techniques: quantization, pruning and sparsity, knowledge distillation, efficient prompting, and KV cache compression. It is maintained with contributions from the research community.