Zhen-Dong/Awesome-Quantization-Papers
A curated list of academic papers on model quantization techniques for efficient deep learning inference.

Velocity (7d): +0.5 ★/day · Trend: steady
This repository aggregates research papers on model quantization published at major AI conferences and journals. Papers are organized by model architecture (transformers, CNNs) and application domain (vision, language, generation), with keyword labels identifying the quantization methods. The collection includes recent work on LLM quantization, diffusion model compression, and edge deployment.