← all repositories

zml/zml

A multi-hardware AI inference stack built with Zig, MLIR, and OpenXLA that targets NVIDIA, AMD, TPU, and Trainium accelerators.

3.6k stars Zig Inference · Serving
zml
Velocity · 7d
+5.7
★ / day
Trend
steady
star history

ZML is a production inference system that decouples AI workloads from proprietary hardware by compiling models directly to target accelerators with peak performance. It supports running various LLMs including Llama 3.1/3.2, Qwen 3.5, and LFM 2.5, and is built using the Zig language, MLIR compiler infrastructure, and Bazel build system.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.