xboot/libonnx

ONNX on a diet: neural nets for devices that hate bloat

A drop-in C99 inference engine that runs on bare metal and still finds time for hardware acceleration.

649 stars · C · Inference · Serving · +0.3 ★/day over the last 7 days (trend: steady)

What it does

libonnx is a single-file-style C99 library that loads and runs ONNX models without dragging in a runtime the size of an operating system. You allocate a context, point it at a model file (or a baked-in byte array), look up the input and output tensors by name, and call onnx_run(). It targets embedded systems where “portable” usually means “compiles without a C++ compiler or a 500 MB dependency tree.”
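
The allocate, look up, run, free pattern described above can be sketched in plain C. This is a toy stand-in rather than libonnx itself: every toy_ name, the fixed "input" and "output" tensor names, and the single ReLU "graph" are assumptions made for illustration, so the block compiles without the library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_LEN 4

/* Toy stand-ins, NOT libonnx types: a context that owns two named tensors. */
struct toy_tensor_t {
    const char *name;
    float data[TOY_LEN];
};

struct toy_context_t {
    struct toy_tensor_t tensors[2];
};

/* Allocate a context; a real engine would parse the model at this point. */
static struct toy_context_t *toy_context_alloc(void)
{
    struct toy_context_t *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;
    ctx->tensors[0].name = "input";
    ctx->tensors[1].name = "output";
    return ctx;
}

/* Look a tensor up by name; returns NULL if no tensor has that name. */
static struct toy_tensor_t *toy_tensor_search(struct toy_context_t *ctx, const char *name)
{
    for (size_t i = 0; i < 2; i++)
        if (strcmp(ctx->tensors[i].name, name) == 0)
            return &ctx->tensors[i];
    return NULL;
}

/* "Run" the graph: here a single ReLU from the input tensor to the output tensor. */
static void toy_run(struct toy_context_t *ctx)
{
    assert(ctx != NULL);
    for (size_t i = 0; i < TOY_LEN; i++) {
        float x = ctx->tensors[0].data[i];
        ctx->tensors[1].data[i] = x > 0.0f ? x : 0.0f;
    }
}

static void toy_context_free(struct toy_context_t *ctx)
{
    free(ctx);
}

/* The whole lifecycle in one place: alloc, search by name, fill, run, read, free.
 * Returns 0 on success and -1 on allocation or lookup failure. */
static int toy_infer(const float *in, float *out)
{
    struct toy_context_t *ctx = toy_context_alloc();
    if (!ctx)
        return -1;
    struct toy_tensor_t *input = toy_tensor_search(ctx, "input");
    struct toy_tensor_t *output = toy_tensor_search(ctx, "output");
    if (!input || !output) {
        toy_context_free(ctx);
        return -1;
    }
    memcpy(input->data, in, sizeof(input->data));
    toy_run(ctx);
    memcpy(out, output->data, sizeof(output->data));
    toy_context_free(ctx);
    return 0;
}
```

Calling toy_infer() with a four-element input buffer and a four-element output buffer walks the full sequence. With the real library the tensor names come from the model, so a lookup can return nothing and is worth checking before writing into it.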

The interesting bit

The hardware acceleration hook is the unusual part for something this small. You pass an array of struct resolver_t * at init, and the engine delegates whatever operators those resolvers can handle. Everything else falls back to the pure C implementation. The examples even include a hand-drawn digit recognizer with SDL2 graphics, which is either charming or slightly surreal for a library that could run on a microcontroller.
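
One way to picture that delegation is a lookup chain: ask each user-supplied resolver for an operator, and fall back to the portable one when they all decline. The sketch below assumes that shape; the toy_ types, the function-pointer signature, and the "Relu"/"Neg" operators are hypothetical and are not taken from the library's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operator signature: transform n floats in place. */
typedef void (*toy_op_fn)(float *data, size_t n);

/* Hypothetical resolver: maps an operator name to an implementation,
 * or returns NULL to decline. NOT the library's own struct. */
struct toy_resolver_t {
    const char *name;
    toy_op_fn (*resolve)(const char *op_type);
};

static void relu_portable(float *d, size_t n)
{
    for (size_t i = 0; i < n; i++)
        d[i] = d[i] > 0.0f ? d[i] : 0.0f;
}

/* Stand-in for an accelerated kernel; it reuses the portable loop here. */
static void relu_accel(float *d, size_t n)
{
    relu_portable(d, n);
}

static void neg_portable(float *d, size_t n)
{
    for (size_t i = 0; i < n; i++)
        d[i] = -d[i];
}

/* The "accelerator" only knows Relu and declines everything else. */
static toy_op_fn accel_resolve(const char *op)
{
    return strcmp(op, "Relu") == 0 ? relu_accel : NULL;
}

/* The portable fallback knows every operator this toy engine supports. */
static toy_op_fn default_resolve(const char *op)
{
    if (strcmp(op, "Relu") == 0)
        return relu_portable;
    if (strcmp(op, "Neg") == 0)
        return neg_portable;
    return NULL;
}

static struct toy_resolver_t toy_accel = { "accel", accel_resolve };
static struct toy_resolver_t toy_default = { "default", default_resolve };

/* Try each user resolver in order, then the portable default.
 * *who names the resolver that was consulted last; the return value
 * is NULL when no resolver implements the operator. */
static toy_op_fn toy_pick(struct toy_resolver_t **r, int rlen,
                          const char *op, const char **who)
{
    assert(op != NULL && who != NULL);
    for (int i = 0; i < rlen; i++) {
        toy_op_fn fn = r[i]->resolve(op);
        if (fn) {
            *who = r[i]->name;
            return fn;
        }
    }
    *who = toy_default.name;
    return toy_default.resolve(op);
}
```

With an array holding only &toy_accel, "Relu" resolves to the accelerated kernel and "Neg" falls through to the portable one. A NULL result is how this sketch models an operator that nobody implements.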

Key highlights

  • Drop-in build: .c and .h files compile with your project, no package manager required
  • Supports ONNX opset 24 (version 1.17.0) — the supported operator table is documented
  • Cross-compilation works via CROSS_COMPILE= prefix; tested example given for arm64
  • Models can be embedded as C arrays via xxd -i, avoiding filesystem dependencies entirely
  • Tests pass on MNIST, MobileNet v2, ShuffleNet, SqueezeNet, super-resolution, and Tiny YOLO v2
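
For the xxd -i route, the generated file boils down to a byte array plus a length, both named after the input file. A minimal sketch of consuming that from memory follows. The four bytes are placeholders rather than a real model, and toy_model_view_t and toy_model_from_memory are hypothetical stand-ins for whatever in-memory loader the library exposes.

```c
#include <assert.h>
#include <stddef.h>

/* `xxd -i model.onnx` emits an array and a length named after the file.
 * These four bytes are placeholders, not a real model. */
unsigned char model_onnx[] = { 0x08, 0x07, 0x12, 0x00 };
unsigned int model_onnx_len = 4;

/* Hypothetical view of an in-memory model; a real loader would parse the bytes. */
struct toy_model_view_t {
    const unsigned char *bytes;
    size_t len;
};

/* Wrap the embedded array without copying it or touching a filesystem.
 * Returns 0 on success, -1 for a missing or empty buffer. */
static int toy_model_from_memory(const void *buf, size_t len,
                                 struct toy_model_view_t *out)
{
    assert(out != NULL);
    if (buf == NULL || len == 0)
        return -1;
    out->bytes = buf;
    out->len = len;
    return 0;
}
```

xxd -i does not mark the array const, so on targets where read-only data lives in flash it may be worth adding const to the generated declaration by hand.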

Caveats

  • Not all ONNX operators are implemented; the README warns that tests outside tests/model may fail
  • The “hardware acceleration support” is a pluggable resolver interface — actual backends are your problem to supply
  • SDL2 dependency for the MNIST example is optional but easy to miss in the build instructions

Verdict

Worth a look if you’re shipping ML to a device where Python is a fantasy and megabytes matter. Skip it if you need the full ONNX operator zoo out of the box or already have a comfortable TensorFlow Lite setup.
