huggingface/optimum
A Hugging Face library that accelerates ML model inference and training through hardware-specific optimizations and quantization.

Velocity (7d): +1.9 ★/day · Trend: → steady
Optimum extends popular ML libraries (Transformers, Diffusers, TIMM, Sentence-Transformers) with optimization tools for efficient model deployment. It provides hardware-specific acceleration for Intel, Graphcore, and Habana processors, and supports quantization, ONNX export, and execution with ONNX Runtime. The library aims to maximize inference and training efficiency while keeping a unified, easy-to-use API.
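As a rough illustration of the ONNX export and ONNX Runtime path mentioned above, the sketch below loads a Transformers checkpoint through Optimum's ONNX Runtime model class and runs it in a standard pipeline. It assumes the ONNX Runtime extra is installed (for example via `pip install "optimum[onnxruntime]"`). The function name `classify_with_onnxruntime` and the default checkpoint are illustrative choices, not something the summary specifies, and package layout may differ between Optimum versions.

```python
def classify_with_onnxruntime(
    text,
    model_id="distilbert-base-uncased-finetuned-sst-2-english",  # assumed example checkpoint
):
    """Export a Transformers checkpoint to ONNX and classify `text` with ONNX Runtime."""
    # Imported lazily because these are heavy optional dependencies.
    from optimum.onnxruntime import ORTModelForSequenceClassification
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the PyTorch checkpoint to ONNX while loading.
    model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)

    # The ONNX Runtime model plugs into the usual Transformers pipeline.
    classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
    return classifier(text)
```

Calling `classify_with_onnxruntime("Optimum makes deployment easier.")` would download the checkpoint on first use and return the pipeline's list of label/score dictionaries. The calling code stays the same as with a plain Transformers model, which is the unified API the description refers to.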