MAC-AutoML/MindPipe
A unified compression and evaluation framework for LLMs and vision-language models supporting quantization and pruning across GPU and NPU hardware.

Velocity (7d): +7.4 ★/day · Trend: steady
MindPipe provides a single CLI pipeline for compressing large language models and vision-language models through quantization and pruning. It implements 11 quantization methods and 7 pruning methods, and integrates evaluation through perplexity (PPL), lm-eval-harness, and VLMEvalKit benchmarks. Both NVIDIA CUDA GPUs and Huawei Ascend NPUs are supported through a shared device abstraction layer.
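To make the PPL metric mentioned above concrete, here is a minimal sketch of how perplexity is generally computed from per-token log-probabilities: the exponential of the mean negative log-likelihood. It is not MindPipe's own evaluation code. The function name `perplexity` and the sample log-probabilities are hypothetical, chosen only for illustration.

```python
import math


def perplexity(token_log_probs):
    """Return perplexity given natural-log probabilities, one per token.

    Generic definition: exp(mean negative log-likelihood).
    Illustrative helper only; not part of MindPipe's API.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical numbers: a baseline model that assigns probability 0.5 to
# every token, and a compressed model that assigns 0.25 to every token.
baseline_log_probs = [math.log(0.5)] * 4
compressed_log_probs = [math.log(0.25)] * 4

print("baseline PPL:  ", perplexity(baseline_log_probs))
print("compressed PPL:", perplexity(compressed_log_probs))
```

Lower perplexity means the model assigns higher probability to the reference text, so comparing the value before and after quantization or pruning gives a quick indication of how much quality the compression cost.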