neuralmagic/sparseml
A Python library for applying sparsification recipes (pruning and quantization) to neural networks, with integrations for PyTorch, TensorFlow, and Keras.

SparseML provides APIs and recipes for compressing neural networks, reducing model size and improving inference speed with minimal accuracy loss. It integrates into existing deep learning training workflows with only a few lines of code and supports model optimization for computer vision, NLP, and other deep learning tasks. The library includes pre-built sparsification recipes as well as tools for writing custom recipes for specific model architectures.
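
To make the two sparsification techniques named above concrete, here is a minimal, library-free sketch of magnitude pruning and symmetric int8 quantization on a flat list of weights. It does not use or reproduce SparseML's API: the function names `magnitude_prune` and `quantize_int8` and the sample weights are hypothetical and chosen only for illustration, and a real recipe would typically apply these steps gradually during training rather than in one shot.

```python
from typing import List, Tuple


def magnitude_prune(weights: List[float], sparsity: float) -> List[float]:
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning).

    Hypothetical helper for illustration; not part of SparseML.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute weight value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:num_to_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization to the int8 range [-127, 127].

    Returns the integer codes and the scale needed to map them back
    to floats (w is approximately q * scale). Hypothetical helper.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


if __name__ == "__main__":
    weights = [0.5, -0.1, 0.03, -0.8, 0.2, 0.01]  # made-up sample values
    pruned = magnitude_prune(weights, sparsity=0.5)
    codes, scale = quantize_int8(pruned)
    print(pruned)  # half of the entries are now exactly zero
    print(codes, scale)
```

Pruning removes the weights that contribute least by magnitude, and quantization then stores the survivors in 8 bits instead of 32. Together these account for the size and speed gains the description refers to, at the cost of a small approximation error.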