mlc-ai/mlc-llm
MLC LLM is a machine learning compilation and deployment engine for running large language models natively on diverse hardware platforms.

Velocity · 7d
+20
★ / day
Trend
→steady
star history
The project compiles and optimizes LLMs using ML compilation techniques (built on TVM) to enable high-performance inference across platforms — from desktop GPUs to mobile devices, browsers, and iOS. It provides MLCEngine as a unified runtime API for loading and running quantized and optimized LLM weights with support for Vulkan, CUDA, Metal, WebGPU, WASM, and OpenCL backends.