onnx/onnx-mlir

A compiler that turns neural-net graphs into native code

ONNX-MLIR compiles ONNX models through MLIR all the way to shared libraries, object files, or even JNI jars.

onnx-mlir · Velocity (7d): +0.4 ★/day · Trend: steady

What it does

ONNX-MLIR ingests an ONNX model and lowers it through MLIR dialects to LLVM IR, object files, or a shared library. It ships with a driver, an ONNX dialect reusable in other projects, and runtimes for Python, C/C++, and Java. IBM’s zDLC compiler for Telum mainframes already builds on it.

The interesting bit

The project treats a neural network as just another program to compile, not a graph to interpret. You can stop at an intermediate representation (ONNX dialect, MLIR, or LLVM IR) or go all the way to a .so or .jar. The default optimization level is -O0, which is honest if inconvenient; the maintainers nudge you toward -O3.
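To make the "stop anywhere" idea concrete, here is a minimal Python sketch that assembles an onnx-mlir command line for a chosen output stage and optimization level. The helper name `build_command`, the target keys, and the file name `model.onnx` are hypothetical. The `--Emit*` flag spellings are the ones the project README is understood to list, so check them against `onnx-mlir --help` on your build. The sketch defaults to -O3 rather than the tool's own -O0, following the maintainers' recommendation.

```python
import shlex

# Hypothetical target names mapped to onnx-mlir emit flags.
# Flag spellings are assumed from the project README; verify with `onnx-mlir --help`.
EMIT_FLAGS = {
    "onnx-dialect": "--EmitONNXIR",
    "mlir": "--EmitMLIR",
    "llvm-ir": "--EmitLLVMIR",
    "object": "--EmitObj",
    "shared-lib": "--EmitLib",
    "jni-jar": "--EmitJNI",
}


def build_command(model_path, target="shared-lib", opt_level=3):
    """Return the argument list for one onnx-mlir invocation (does not run it)."""
    if target not in EMIT_FLAGS:
        raise ValueError(f"unknown target {target!r}; expected one of {sorted(EMIT_FLAGS)}")
    if opt_level not in (0, 1, 2, 3):
        raise ValueError("opt_level must be 0, 1, 2, or 3")
    return ["onnx-mlir", f"-O{opt_level}", EMIT_FLAGS[target], model_path]


if __name__ == "__main__":
    # Print the commands instead of executing them, so this runs without onnx-mlir installed.
    for target in ("mlir", "shared-lib"):
        print(shlex.join(build_command("model.onnx", target)))
```

Passing the returned list to `subprocess.run(..., check=True)` would invoke the compiler, assuming `onnx-mlir` is on your PATH or you are inside the project's Docker image.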

Key highlights

  • Emits ONNX dialect, MLIR, LLVM IR, object files, shared libraries, or JNI jars via onnx-mlir --Emit* flags
  • Includes Python, C/C++, and Java runtimes for the compiled models
  • Pinned to a specific LLVM/MLIR commit; the clone-mlir.sh script tracks the known-good hash
  • Docker is the “preferred approach” because native dependency setup “may be tricky”
  • Weekly open meetings, DCO-signed commits, and a public model-zoo test dashboard

Caveats

  • Native builds require clang ≥ 18.1.3, protobuf ≥ 33.5, and cmake ≥ 3.26.0; the README warns this is finicky
  • No topics set on the GitHub repo, which makes discovery harder than it should be

Verdict

Worth a look if you need to deploy ONNX models as native binaries without dragging in a full Python runtime. Skip it if you just want to run inference in PyTorch or ONNX Runtime and don’t care about AOT compilation.
