A compiler that turns neural-net graphs into native code
ONNX-MLIR compiles ONNX models through MLIR all the way to shared libraries, object files, or even JNI jars.

What it does
ONNX-MLIR ingests an ONNX model and lowers it through MLIR dialects to LLVM IR, object files, or a shared library. It ships with a driver, an ONNX dialect reusable in other projects, and runtimes for Python, C/C++, and Java. IBM’s zDLC compiler for Telum mainframes already builds on it.
The interesting bit
The project treats a neural network as just another program to compile, not a graph to interpret. You can stop at intermediate representations—ONNX dialect, MLIR, or LLVM IR—or go all the way to a .so or .jar. The default optimization level is -O0, which is honest if inconvenient; the maintainers nudge you toward -O3.
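As a rough illustration of the "stop anywhere" idea, here is a minimal Python sketch that assembles an onnx-mlir command line for a chosen stopping point and optimization level. The flag names are the ones the driver's help output lists as far as I know; confirm them with onnx-mlir --help on your build. The helper name build_command, the short target labels, and the file name model.onnx are hypothetical and exist only for this example.

```python
import shlex

# Mapping from this sketch's own short labels to onnx-mlir emit flags.
# Assumption: these flag names match your onnx-mlir build; check --help.
EMIT_FLAGS = {
    "onnx-ir": "--EmitONNXIR",  # stop at the ONNX dialect
    "mlir": "--EmitMLIR",       # stop at lowered MLIR
    "llvm-ir": "--EmitLLVMIR",  # stop at LLVM IR
    "obj": "--EmitObj",         # object file
    "lib": "--EmitLib",         # shared library
    "jni": "--EmitJNI",         # JNI jar
}


def build_command(model_path, target="lib", opt_level=3, driver="onnx-mlir"):
    """Return the argv list for one compiler invocation (hypothetical helper)."""
    if target not in EMIT_FLAGS:
        raise ValueError(f"unknown target: {target!r}")
    if opt_level not in (0, 1, 2, 3):
        raise ValueError(f"optimization level must be 0..3, got {opt_level!r}")
    # The driver defaults to -O0, so the level is always passed explicitly.
    return [driver, f"-O{opt_level}", EMIT_FLAGS[target], model_path]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even on a machine without onnx-mlir installed.
    print(shlex.join(build_command("model.onnx")))
```

To actually compile, pass the returned list to subprocess.run(cmd, check=True) on a machine where the driver is on PATH. Always passing the optimization level explicitly sidesteps the -O0 default.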
Key highlights
- Emits ONNX dialect, MLIR, LLVM IR, object files, shared libraries, or JNI jars via onnx-mlir --Emit* flags
- Includes Python, C/C++, and Java runtimes for the compiled models
- Pinned to a specific LLVM/MLIR commit; the clone-mlir.sh script tracks the known-good hash
- Docker is the “preferred approach” because native dependency setup “may be tricky”
- Weekly open meetings, DCO-signed commits, and a public model-zoo test dashboard
Caveats
- Native builds require clang ≥ 18.1.3, protobuf ≥ 33.5, and cmake ≥ 3.26.0; the README warns this is finicky
- No topics set on the GitHub repo, which makes discovery harder than it should be
Verdict
Worth a look if you need to deploy ONNX models as native binaries without dragging in a full Python runtime. Skip it if you just want to run inference in PyTorch or ONNX Runtime and don’t care about AOT compilation.