fastmachinelearning/hls4ml

Your Keras model, now running on bare metal at CERN

hls4ml compiles neural networks into FPGA firmware for sub-microsecond inference.

2k stars · Python · Inference · Serving

What it does

hls4ml takes trained models from Keras, PyTorch, or ONNX and spits out synthesizable C++ for FPGA high-level synthesis (HLS) tools. The goal is inference fast enough to sit inside hardware trigger systems: think filtering particle collisions before the data even leaves the detector.
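A minimal sketch of that flow for a Keras model, assuming hls4ml is installed and you already have a trained model in hand. The helper names `conversion_kwargs` and `keras_to_hls`, the output directory, and the choice of the Vitis backend are illustrative assumptions, not something the README prescribes.

```python
# Sketch of a Keras-to-HLS conversion with hls4ml.
# Helper names, output directory, and backend choice are illustrative assumptions.


def conversion_kwargs(output_dir="hls4ml_prj", backend="Vitis"):
    """Collect keyword arguments for the converter (hypothetical helper)."""
    return {"output_dir": output_dir, "backend": backend}


def keras_to_hls(model, **kwargs):
    """Turn a trained Keras model into an hls4ml project (hypothetical helper)."""
    import hls4ml  # imported lazily so this file loads without hls4ml installed

    # Derive a model-wide precision and reuse configuration from the Keras model.
    hls_config = hls4ml.utils.config_from_keras_model(model, granularity="model")

    # Write the generated C++ project to output_dir for the chosen backend.
    return hls4ml.converters.convert_from_keras_model(
        model, hls_config=hls_config, **conversion_kwargs(**kwargs)
    )
```

With a model in hand, `hls_model = keras_to_hls(model)` followed by `hls_model.compile()` should give you a bit-accurate software emulation to call `predict` on, while `hls_model.build()` hands the project to the vendor HLS tool, which is the slow step.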

The interesting bit

The project grew out of work at CERN’s Large Hadron Collider, where Level-1 (L1) triggers have brutal latency budgets. That same need for “decide now, ask questions later” has since spread to nuclear fusion feedback loops, quantum computing control systems, and satellite environmental monitoring. It’s a neat example of physics infrastructure leaking into broader engineering.

Key highlights

  • Supports Xilinx Vivado/Vitis HLS, Intel HLS, Catapult HLS, and experimental Intel oneAPI backends
  • Handles CNNs, distributed arithmetic, and binary/ternary quantized networks (each with its own citation trail)
  • pip install hls4ml gets you started; hls4ml[profiling] adds profiling tools
  • Ships with example models and a separate tutorial repo
  • Active enough to merit a 2025 overview paper and a v1.3.0 release
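For a feel of what binary and ternary quantization mean in the list above, here is a generic pure-Python illustration. It is not hls4ml's implementation, and the 0.5 threshold is an arbitrary assumption. The appeal on an FPGA is that multiplying by a weight in {-1, 0, +1} reduces to a sign flip or a skip.

```python
# Generic illustration of binary and ternary weight quantization.
# Not hls4ml's code; the threshold value is an arbitrary assumption.


def binarize(weights):
    """Map each weight to +1 or -1 by sign (zero maps to +1)."""
    return [1 if w >= 0 else -1 for w in weights]


def ternarize(weights, threshold=0.5):
    """Map each weight to -1, 0, or +1; magnitudes up to the threshold become 0."""
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


weights = [0.9, -0.2, 0.0, -1.3, 0.6]
binary = binarize(weights)    # [1, -1, 1, -1, 1]
ternary = ternarize(weights)  # [1, 0, 0, -1, 1]
```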

Caveats

  • You’ll need the vendor HLS tools installed separately; Vivado alone is a multi-gigabyte download
  • The README notes synthesis “might take several minutes”, which is an understatement for larger models
  • Intel oneAPI support is explicitly marked experimental

Verdict

Grab this if you’re building real-time control systems where a GPU is too slow or too power-hungry, and you already speak some FPGA. Skip it if you’re looking for cloud-scale batch inference or don’t have access to synthesis tooling.
