Xilinx/brevitas

Xilinx's side project for shrinking neural nets

A PyTorch library that lets you quantize layers individually without rewriting your model from scratch.

Velocity (7d): +0.5 ★/day · Trend: steady

What it does

Brevitas provides quantized drop-in replacements for standard PyTorch layers (QuantConv2d, QuantLSTM, QuantMultiheadAttention, and others), so you can apply post-training quantization (PTQ) or quantization-aware training (QAT) without abandoning familiar APIs. Each tensor a layer handles (weights, inputs, bias, outputs) gets its own tunable quantization settings.

The interesting bit

The granularity is the selling point: you can tune bit-width and scale per layer rather than accepting one-size-fits-all quantization. There’s also a reference PTQ pipeline for ImageNet models if you want to see how a torchvision model behaves at 4-bit versus 8-bit.
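To get a feel for why bit-width matters, here is a minimal pure-Python sketch of symmetric uniform quantization applied to the same random "weights" at 4 and 8 bits. It is not Brevitas code and does not use its API: the helper names `fake_quantize` and `mean_abs_error` are hypothetical, and the single max-abs scale per tensor is an assumption chosen for simplicity.

```python
import random


def fake_quantize(values, bit_width):
    """Round values to a signed integer grid of the given bit-width, then rescale.

    Assumption: one symmetric scale per tensor, derived from the max absolute value.
    """
    q_max = 2 ** (bit_width - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    quantized = []
    for v in values:
        q = round(v / scale)
        q = max(-q_max, min(q_max, q))  # clamp to the representable range
        quantized.append(q * scale)     # map back to floating point
    return quantized


def mean_abs_error(original, approximated):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(original, approximated)) / len(original)


random.seed(0)
# Stand-in for one layer's weights; real layers would come from a trained model.
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]

errors = {
    bits: mean_abs_error(weights, fake_quantize(weights, bits))
    for bits in (4, 8)
}
for bits, err in errors.items():
    print(f"{bits}-bit mean abs error: {err:.5f}")
```

The 4-bit grid has only 15 levels, so its rounding error is noticeably larger than at 8 bits. Choosing the bit-width layer by layer lets you spend precision only where that error actually hurts accuracy.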

Key highlights

  • Drop-in quantized variants of common torch.nn layers under brevitas.nn
  • Independent quantization config for weights, inputs, bias, and outputs
  • Supports both PTQ and QAT workflows
  • Reference example for ImageNet classification PTQ included
  • Available on PyPI; supports Python 3.9–3.12 and PyTorch 1.12–2.8

Caveats

  • Explicitly labeled a research project, not an official Xilinx product
  • PyTorch versions beyond 2.8 are untested, so bleeding-edge installs may break

Verdict

Worth a look if you’re shipping models to FPGA or edge hardware and need fine-grained control over quantization tradeoffs. Skip it if you want a polished, vendor-supported toolchain with guaranteed upstream compatibility.
