bitsandbytes-foundation/bitsandbytes

PyTorch library for k-bit quantization of large language models, providing 8-bit optimizers, LLM.int8() inference, and QLoRA 4-bit training.

Velocity (7d): +4.5 ★ / day · Trend: steady

bitsandbytes provides memory-efficient quantization techniques for large language models in PyTorch. Its 8-bit optimizers store optimizer state with block-wise quantization, which reduces training memory while matching the results of 32-bit optimizers. For inference, LLM.int8() applies vector-wise quantization to most features and keeps outlier features in 16-bit, cutting memory requirements roughly in half relative to 16-bit weights without degrading model quality. QLoRA enables training on 4-bit quantized models: the quantized base weights stay frozen and only small low-rank adapter (LoRA) matrices are trained, which substantially lowers VRAM requirements.
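To make the block-wise idea concrete, here is a minimal pure-Python sketch of absmax quantization with one scale per block. It is an illustration under simplifying assumptions, not the bitsandbytes implementation: the function names, the tiny block size of 4, and the linear int8 mapping are all hypothetical choices made for readability, and the library's actual kernels and quantization maps differ.

```python
# Illustrative sketch only: linear absmax block-wise 8-bit quantization.
# NOT the bitsandbytes kernels; names and block_size are assumptions.

def quantize_blockwise(values, block_size=4):
    """Quantize floats to int8-range codes with one absmax scale per block."""
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # Largest magnitude in the block; fall back to 1.0 for an all-zero block.
        absmax = max(abs(v) for v in block) or 1.0
        scales.append(absmax)
        # Map [-absmax, absmax] onto the integer range [-127, 127].
        codes.extend(round(v / absmax * 127) for v in block)
    return codes, scales


def dequantize_blockwise(codes, scales, block_size=4):
    """Map codes back to floats using the scale of the block each code came from."""
    return [c / 127 * scales[i // block_size] for i, c in enumerate(codes)]


# The second block contains much larger values than the first.
weights = [0.02, -0.01, 0.03, 0.015, 8.0, -4.0, 2.0, 1.0]
codes, scales = quantize_blockwise(weights)
restored = dequantize_blockwise(codes, scales)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Because each block carries its own scale, the large values in the second block do not consume the precision available to the small values in the first. With a single scale for the whole list, the first four values would all round to zero.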
