google/qkeras

Shrink your neural nets without rewriting them

QKeras lets you quantize Keras models by swapping layer classes (QConv2D for Conv2D, QDense for Dense) while leaving the rest of your training code largely unchanged.

What it does

QKeras is a Google-built extension that adds quantized drop-in replacements for standard Keras layers. You swap Conv2D for QConv2D, Dense for QDense, and so on, then specify how many bits you want for weights, biases, and activations. The library handles the arithmetic quantization so you can train low-precision networks without leaving the Keras API.
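To make "specify how many bits" concrete, here is a minimal pure-Python sketch of signed fixed-point quantization. It is an illustration only, not QKeras's implementation: the function name `quantize_fixed_point` and its parameters are hypothetical, and the library's own quantizers may differ in rounding, scaling, and how gradients are handled during training.

```python
def quantize_fixed_point(x, bits, integer_bits=0):
    """Snap x to a signed fixed-point grid with `bits` total bits.

    Hypothetical helper for illustration: one bit is spent on the sign,
    `integer_bits` on the integer part, and the rest on the fraction.
    """
    frac_bits = bits - integer_bits - 1
    step = 2.0 ** -frac_bits            # spacing between representable values
    lowest = -(2.0 ** integer_bits)     # most negative representable value
    highest = 2.0 ** integer_bits - step
    # Python's round() rounds exact ties to the nearest even integer.
    snapped = round(x / step) * step
    # Saturate instead of wrapping when x falls outside the range.
    return min(max(snapped, lowest), highest)


for value in (0.3, 0.95, -1.7):
    print(value, "->", quantize_fixed_point(value, bits=4))
```

With 4 bits and no integer bits, the grid has a step of 0.125 and covers -1.0 to 0.875, so fewer bits means both a coarser grid and earlier saturation.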

The interesting bit

The project bundles two less-obvious tools that bridge the gap between “it trains” and “it ships.” QTools generates hardware data-type maps and estimates energy consumption in picojoules based on a 45nm reference model—useful for comparing quantized architectures before you touch an FPGA. AutoQKeras automates the bit-width search across layers using Keras Tuner, treating precision as a hyperparameter rather than a manual guessing game.
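The kind of comparison an energy estimate enables can be sketched in a few lines. This is not the QTools API: `ENERGY_PJ` and `dense_energy_pj` are hypothetical names, the per-operation values are commonly quoted 45nm integer figures used here as illustrative placeholders, and the sketch counts only multiply-accumulate operations while ignoring memory access.

```python
# Illustrative per-operation energy in picojoules (placeholder values,
# not taken from QTools; memory traffic is deliberately ignored).
ENERGY_PJ = {
    ("add", 8): 0.03,
    ("add", 32): 0.1,
    ("mul", 8): 0.2,
    ("mul", 32): 3.1,
}


def dense_energy_pj(n_inputs, n_outputs, bits):
    """Rough energy of one forward pass through a dense layer, MACs only."""
    macs = n_inputs * n_outputs  # one multiply and one add per weight
    return macs * (ENERGY_PJ[("mul", bits)] + ENERGY_PJ[("add", bits)])


int8 = dense_energy_pj(256, 128, bits=8)
int32 = dense_energy_pj(256, 128, bits=32)
print(f"8-bit: {int8:.0f} pJ, 32-bit: {int32:.0f} pJ, ratio: {int32 / int8:.1f}x")
```

The absolute numbers matter less than the ratio between candidate architectures, which is the use case the estimates are meant for.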

Key highlights

  • Drop-in quantized layers for CNNs, separable convolutions, RNNs/LSTMs/GRUs, and even bidirectional wrappers
  • Binary, ternary, stochastic, and power-of-two quantizers (binary, stochastic_ternary, quantized_po2) for aggressive compression
  • QTools: JSON data-type maps and energy estimates for hardware targeting
  • AutoQKeras: automated bit-width search with distributed training support
  • Born from particle-physics research, where it is used for low-latency edge inference on detector data
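The quantizer families named in the highlights can be illustrated with a few toy functions. These are simplified stand-ins written for this summary, not the library's code: the names, the fixed threshold, and the exponent range are all assumptions, and QKeras's own quantizers operate on tensors and may choose thresholds and scales differently.

```python
import math
import random


def ternary(x, threshold=0.5):
    """Map x to -1, 0, or +1 using a fixed (assumed) threshold."""
    if x > threshold:
        return 1.0
    if x < -threshold:
        return -1.0
    return 0.0


def power_of_two(x, min_exp=-4, max_exp=0):
    """Round x to the nearest power of two in log space, keeping its sign."""
    if x == 0:
        return 0.0
    exponent = round(math.log2(abs(x)))
    exponent = min(max(exponent, min_exp), max_exp)  # clamp to allowed range
    return math.copysign(2.0 ** exponent, x)


def stochastic_round(x, rng):
    """Round up with probability equal to the fractional part of x."""
    lower = math.floor(x)
    return lower + (1 if rng.random() < x - lower else 0)


rng = random.Random(0)  # seeded so repeated runs draw the same samples
print(ternary(0.7), ternary(-0.2))
print(power_of_two(0.3), power_of_two(-0.7))
print(stochastic_round(2.3, rng))
```

Power-of-two weights are attractive on hardware because a multiplication by them reduces to a bit shift, while stochastic rounding keeps the quantized value unbiased on average.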

Caveats

  • QBatchNormalization is explicitly marked experimental; the authors note they rarely need it because stochastic activations already regularize
  • Some layer wrappers (like Bidirectional) have rough edges; workarounds exist, but the interface may change
  • Energy estimates are relative-comparison tools, not absolute predictions for your specific process node

Verdict

Worth a look if you’re squeezing Keras models onto FPGAs or ASICs and want to keep your training code recognizable. Less compelling if you’re already committed to PyTorch or need production-grade quantization with guaranteed bit-exact inference.
