
huawei-csl/SINQ

SINQ is a plug-and-play quantization technique that makes any LLM significantly smaller and faster to run without sacrificing accuracy.

Velocity (7d): +2.4 ★/day · Trend: steady

SINQ (Sinkhorn-Normalized Quantization) is a calibration-free low-precision quantization method for LLMs. It uses optimal transport theory and Sinkhorn normalization to quantize model weights while maintaining model quality. The method is model-agnostic and works with various LLMs including Qwen and DeepSeek models. SINQ has been accepted at ICML 2026 and is natively integrated into HuggingFace Transformers via SinqConfig, enabling simplified deployment of quantized models.
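To make the idea of normalizing before quantizing concrete, here is a minimal, stdlib-only sketch. It is not the SINQ implementation and does not use the `SinqConfig` API. The function names (`sinkhorn_normalize`, `quantize_rtn`, `dequantize`), the use of standard deviation as the quantity being balanced, the fixed iteration count, and the single global 4-bit grid are all assumptions chosen for illustration. The sketch alternately rescales the rows and columns of a weight matrix, in the spirit of Sinkhorn iterations, then applies plain round-to-nearest quantization to the balanced matrix.

```python
import random
from statistics import pstdev


def sinkhorn_normalize(w, iters=20, eps=1e-8):
    """Alternately rescale rows and columns so their std-devs approach 1.

    Returns (m, row_scale, col_scale) such that
    w[i][j] == row_scale[i] * m[i][j] * col_scale[j] up to float error.
    Illustrative only; not the procedure from the SINQ repository.
    """
    rows, cols = len(w), len(w[0])
    m = [row[:] for row in w]
    row_scale = [1.0] * rows
    col_scale = [1.0] * cols
    for _ in range(iters):
        for i in range(rows):  # balance each row
            s = pstdev(m[i]) + eps
            row_scale[i] *= s
            m[i] = [v / s for v in m[i]]
        for j in range(cols):  # balance each column
            s = pstdev([m[i][j] for i in range(rows)]) + eps
            col_scale[j] *= s
            for i in range(rows):
                m[i][j] /= s
    return m, row_scale, col_scale


def quantize_rtn(m, bits=4):
    """Uniform round-to-nearest with one step size and offset for the matrix."""
    flat = [v for row in m for v in row]
    lo, hi = min(flat), max(flat)
    step = (hi - lo) / (2 ** bits - 1) or 1.0  # avoid a zero step
    codes = [[round((v - lo) / step) for v in row] for row in m]
    return codes, step, lo


def dequantize(codes, step, lo, row_scale, col_scale):
    """Map integer codes back to weights and reapply both scale vectors."""
    return [
        [row_scale[i] * (c * step + lo) * col_scale[j] for j, c in enumerate(row)]
        for i, row in enumerate(codes)
    ]


# Toy 6x8 weight matrix with one outlier row (made-up data).
random.seed(0)
w = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(6)]
w[2] = [50.0 * v for v in w[2]]

m, row_scale, col_scale = sinkhorn_normalize(w)
codes, step, lo = quantize_rtn(m, bits=4)
w_hat = dequantize(codes, step, lo, row_scale, col_scale)
```

In this sketch the outlier row's magnitude is absorbed into `row_scale`, so the shared 4-bit grid is spent on the balanced matrix `m` rather than stretched to cover a few large values. No calibration data is involved at any point: every quantity is derived from the weights alone.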
