A curated survival guide for shrinking neural nets onto silicon
A reading list that tracks how to compress models and cram them onto FPGAs/ASICs without the marketing fluff.

What it does This repo is a meticulously organized bibliography of research papers focused on two practical problems: making deep neural networks smaller (compression, pruning, quantization, distillation) and running them efficiently on custom hardware (FPGA/ASIC accelerators). It covers work from 2016–2019 conferences and includes canonical survey papers as tutorials.
The interesting bit The maintainer flags their own field as “changing rapidly” and admits entries may be “somewhat antiquated” — a rare honesty in ML curation. The hardware section notably splits out RNN-specific accelerators as its own deep category, reflecting how recurrent architectures pose distinct memory and parallelism challenges compared to CNNs.
Key highlights
- Network compression: Covers six major strategies — parameter sharing, knowledge distillation, fixed-precision/binarized networks, sparsity/pruning, tensor decomposition, and adaptive/conditional computing
- Hardware accelerators: Dedicated sections for RNN and CNN accelerators, plus benchmarking suites (MLPerf, DAWNBench, DeepBench)
- Conference tracking: Papers organized by venue and year (NIPS 2016 through SysML 2019)
- Canonical tutorials: Links to Sze et al.’s hardware survey and Cheng et al.’s compression survey as entry points
- “Our Contributions” section: Currently marked TODO — this is a reading list, not original research
Caveats
- README explicitly warns that entries may be outdated; last major update appears around 2019
- CNN accelerator section defers entirely to another repo (Neural-Networks-on-Silicon) rather than curating its own
- No code, implementations, or reproduced results — purely a paper index
Verdict Worth bookmarking if you’re entering model compression or neural network hardware and need a structured map of the 2016–2019 literature. Skip it if you want implementations, recent post-2019 work, or guidance on which techniques actually work in practice.