
PrunaAI/pruna

A Python framework that optimizes AI models (LLMs, diffusion, transformers, speech) for speed, size, and efficiency.

Velocity (7d): +2.7 ★/day · Trend: steady

Pruna is a model optimization framework for developers that aims to make AI models faster, cheaper, smaller, and greener. It supports LLMs, diffusion models, transformers, computer vision models, and speech recognition systems. The framework applies techniques such as quantization and compression to reduce computational overhead and improve deployment efficiency.
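
To make "quantization" concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It does not use Pruna's API and is not how Pruna implements quantization; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, chosen only to illustrate the general idea of trading a small amount of precision for a 4x smaller weight representation.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to int8 with one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero or empty tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric int8 range.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.27, 0.031, 1.0, -0.75]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

float32_bytes = len(weights) * 4      # storage if each weight were a 32-bit float
int8_bytes = len(q) * q.itemsize      # storage for the int8 codes (scale excluded)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The rounding error per weight is bounded by half the scale, which is why quantization usually works best when the weights in a tensor span a similar range. Production frameworks typically add per-channel scales, calibration data, and fused low-precision kernels on top of this basic scheme.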
