google/XNNPACK
A Google library of highly optimized neural network inference operators for ARM, x86, WebAssembly, and RISC-V platforms.

XNNPACK provides low-level performance primitives for neural network inference, implementing operators such as convolution, pooling, and activation functions. It targets mobile, server, and web platforms and serves as a backend acceleration layer for high-level ML frameworks, including TensorFlow Lite, PyTorch, ONNX Runtime, and MediaPipe. Its kernels use SIMD optimizations on each supported architecture, and it supports fixed-point (quantized) inference for efficient deployment.
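
To make the fixed-point quantization mentioned above more concrete, here is a minimal Python sketch of the affine int8 scheme commonly used for quantized inference. It is an illustration only and does not use XNNPACK's API: the function names `quantize` and `dequantize`, and the example `scale` and `zero_point` values, are assumptions chosen for this sketch.

```python
def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map real values to integer codes: q = clamp(round(x / scale) + zero_point).

    Hypothetical helper for illustration; not part of XNNPACK.
    """
    codes = []
    for x in values:
        q = round(x / scale) + zero_point
        # Saturate to the representable int8 range instead of wrapping around.
        codes.append(max(qmin, min(qmax, q)))
    return codes


def dequantize(codes, scale, zero_point):
    """Recover approximate real values: x ~= (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in codes]


if __name__ == "__main__":
    scale, zero_point = 0.25, 3  # assumed example parameters
    codes = quantize([0.0, 0.5, -0.5, 100.0], scale, zero_point)
    print(codes)
    print(dequantize(codes, scale, zero_point))
```

Values outside the representable range saturate at the int8 limits, so the round trip through `quantize` and `dequantize` is lossy for large inputs. Integer codes of this kind are what allow a quantized operator to run mostly in integer arithmetic.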