AIoT-MLSys-Lab/Efficient-LLMs-Survey
A comprehensive academic survey on efficiency techniques for large language models, published in TMLR 2024.

Velocity (7d): +1.1 ★/day
Trend: steady
This repository hosts a peer-reviewed survey paper covering methods for improving the efficiency of large language models, including techniques for model compression, quantization, distillation, and system optimization. It compiles and categorizes research across training, inference, and architectural improvements for LLMs.