OpenPPL/ppq
A neural network quantization tool for optimizing model inference and deployment on hardware platforms.

PPQ is an extensible, high-performance neural network quantization tool designed for industrial applications. It converts floating-point operations into lower-bit fixed-point representations, which reduces memory footprint, speeds up inference, and makes deployment on resource-constrained hardware practical. The tool supports multiple deep learning frameworks and formats, including PyTorch and ONNX, can parse and modify complex neural network architectures, and lets users control the quantization strategy for each target hardware platform.
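
To make the idea of converting floating-point values to a lower-bit fixed-point representation concrete, here is a minimal sketch of symmetric per-tensor linear quantization in plain Python. It is an illustration only and does not use or reproduce PPQ's API; the function names `quantize_symmetric` and `dequantize`, the 8-bit default, and the sample weights are all assumptions made for this example.

```python
from typing import List, Tuple


def quantize_symmetric(values: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed integers using one scale for the whole tensor.

    Hypothetical helper for illustration; not part of PPQ.
    """
    qmax = 2 ** (num_bits - 1) - 1  # largest representable magnitude, 127 for 8 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    # Fall back to a scale of 1.0 when the input is empty or all zeros.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step, then clamp to the symmetric range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floating-point values from the integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.01, 1.0]  # made-up sample values
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)

# The round-trip error of each value is bounded by half a quantization step.
errors = [abs(w - r) for w, r in zip(weights, restored)]
print(q, scale, max(errors))
```

Each stored value shrinks from a 32-bit float to an 8-bit integer plus one shared scale, at the cost of a rounding error of at most half a step per value. A production tool typically also has to choose scales per channel, calibrate activations on sample data, and match the rounding rules of the target hardware, none of which this sketch attempts.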