PrunaAI/pruna
A Python framework that optimizes AI models (LLMs, diffusion, transformers, speech) for speed, size, and efficiency.

Velocity (7d): +2.7 ★/day · Trend: steady
Pruna is a model optimization framework for developers that aims to make AI models faster, cheaper, smaller, and greener. It supports LLMs, diffusion models, transformers, computer vision models, and speech recognition systems. The framework applies techniques such as quantization and compression to reduce computational overhead and improve deployment efficiency.
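
To make the quantization idea mentioned above concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is not Pruna's API and does not claim to reflect how Pruna implements quantization. The function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, chosen only for illustration.

```python
# Illustrative only: symmetric 8-bit quantization of a list of weights.
# This is NOT Pruna's API; all names and values here are hypothetical.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in quantized]


weights = [0.82, -1.27, 0.05, 0.4, -0.66]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Largest difference between an original weight and its restored value.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored as an 8-bit integer instead of a 32-bit float, so the stored values take roughly a quarter of the space. The cost is a rounding error of at most about half the scale factor per weight.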