Huawei's research lab open-sources a decade of model slimming tricks
A monorepo bundling nine different paths to making neural networks cheaper to run, from binary weights to training-free compression.

What it does
This is a curated collection of reference implementations from Huawei Noah’s Ark Lab, spanning the main branches of efficient deep learning: quantization, pruning, knowledge distillation, binary networks, self-supervised pre-training, and even a speed-focused YOLO variant. Each subdirectory houses code for specific published methods, many of them from top-tier venues (NeurIPS, CVPR, ICCV, ECCV).
The interesting bit
The repo treats “efficiency” as a systems problem, not a single technique. Want to compress without touching data? There’s a folder for that. Want to train faster by dynamically expanding network capacity? Also here. It’s less a unified framework and more a research group’s greatest-hits album—with code.
Key highlights
- Data-free compression methods for when you can’t ship training data to the edge
- AdaBin (ECCV 2022) for binary neural networks—1-bit weights, still functional
- Gold-YOLO (NeurIPS 2023), an efficient detector that trades the kitchen-sink YOLO complexity for actual speed
- NetworkExpansion (CVPR 2023), which accelerates training rather than inference—a rarer target
- IPG (CVPR 2024), applying graph neural networks to super-resolution, because convolutions apparently weren’t flexible enough
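To make "1-bit weights" concrete, here is a minimal sketch of the textbook sign-plus-scale binarization that binary networks build on. It is not AdaBin's method (that paper adapts the binary values themselves), and the names `binarize` and `dequantize` are hypothetical, not taken from the repo.

```python
def binarize(weights):
    """Approximate real-valued weights as alpha * sign(w), with alpha = mean(|w|)."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    # Treat sign(0) as +1 so every weight is stored in exactly one bit.
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs


def dequantize(alpha, signs):
    """Rebuild the approximate weights from one shared scale and the sign bits."""
    return [alpha * s for s in signs]


weights = [0.5, -1.5, 0.25, -0.75]
alpha, signs = binarize(weights)
approx = dequantize(alpha, signs)
print(alpha, signs, approx)
```

Each weight collapses to a single sign bit plus one shared float, which is where the memory savings come from; the gap between `weights` and `approx` is the accuracy cost that the published methods try to shrink.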
Caveats
- No top-level documentation or cross-method benchmarks; you’re on your own to compare techniques
- Jupyter Notebook is listed as the primary language, which suggests scattered notebooks rather than polished libraries
- Each subproject appears self-contained; don’t expect a single pip install to rule them all
Verdict
Worth bookmarking if you’re building a model compression pipeline and want to test-drive published methods without reimplementing from scratch. Skip it if you need a unified, maintained toolkit with stable APIs—this is a paper-code dump, not a product.