Is LiteRT-LM open source?

Yes — google-ai-edge/LiteRT-LM is open source, released under the Apache-2.0 license.

What language is LiteRT-LM written in?

google-ai-edge/LiteRT-LM is primarily written in C++.

How popular is LiteRT-LM?

google-ai-edge/LiteRT-LM has about 6k stars on GitHub, and its star growth is currently steady.

Where can I find LiteRT-LM?

google-ai-edge/LiteRT-LM is on GitHub at https://github.com/google-ai-edge/LiteRT-LM.

google-ai-edge/LiteRT-LM

Google's LLM runtime that actually ships in Chrome and watches

A C++ inference engine built to run Gemma, Llama, and friends on everything from Raspberry Pi to Pixel Watch—because the cloud is sometimes just too far away.

★ 6k stars · C++ · Inference · Serving · Language Models · Agents

Velocity (7d): +13 ★ / day · Trend: steady

What it does

LiteRT-LM is Google’s inference framework for running large language models on edge hardware. It executes models on CPU, GPU, and NPU backends and exposes APIs for Python, Kotlin, and C++, plus Swift and browser JavaScript (both early preview). The project also includes a CLI tool for running models directly without writing code.

The interesting bit

This isn’t experimental—it’s already running inside Chrome, Chromebook Plus, and Pixel Watch. The framework supports multi-token prediction (MTP) for speculative decoding, which Google claims makes Gemma 4 inference up to 3× faster. It also handles vision and audio inputs, plus function calling for agentic workflows, all without leaving the device.
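To make the speculative-decoding idea concrete, here is a minimal, library-free sketch of the greedy variant: a cheap draft step guesses several tokens ahead, and the full model keeps the longest prefix it agrees with. It is not LiteRT-LM's implementation or API. The names `speculative_decode`, `toy_target`, and `toy_draft` are hypothetical, the "models" are toy functions over integers, and a separate draft function stands in for the MTP mechanism, which the source does not describe in detail.

```python
from typing import Callable, List, Tuple

Token = int
NextTokenFn = Callable[[List[Token]], Token]


def speculative_decode(
    target_next: NextTokenFn,
    draft_next: NextTokenFn,
    prompt: List[Token],
    max_new_tokens: int,
    draft_len: int = 4,
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns (tokens, target verification passes)."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    target_passes = 0
    while len(tokens) < goal:
        # 1. Draft: cheaply guess up to draft_len tokens ahead.
        k = min(draft_len, goal - len(tokens))
        draft: List[Token] = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))

        # 2. Verify: a real runtime would score all k positions in one batched
        #    pass of the full model. This toy calls the target per position
        #    but counts the whole check as a single pass.
        target_passes += 1
        for i in range(k):
            expected = target_next(tokens + draft[:i])
            tokens.append(expected)  # always the token the target would emit
            if draft[i] != expected:
                break  # first wrong guess: discard the rest of the draft
    return tokens, target_passes


# Hypothetical toy models: the target counts upward modulo 10; the draft
# agrees with it except right after a 4, where it guesses wrong.
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    out, passes = speculative_decode(toy_target, toy_draft, [0], max_new_tokens=8)
    print(out, passes)
```

Because every appended token is the one the target would have chosen, the output matches plain greedy decoding with the target alone. In this toy run, eight new tokens take three verification passes instead of eight. Real speedups depend on how often the draft is right and on how cheap it is relative to the full model. The sketch also omits the extra "bonus" token that full implementations usually take from the verification pass when every drafted token is accepted.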

Key highlights

Ships in actual Google products, not just demo apps
Cross-platform: Android, iOS, web, desktop, and IoT including Raspberry Pi
Hardware acceleration via GPU and NPU across Linux, macOS, and Windows
Multi-modality support (vision, audio) and tool use / function calling
Broad model compatibility: Gemma, Llama, Phi-4, Qwen, and others
Community Flutter package available via flutter_gemma

Caveats

Swift and JavaScript APIs are marked “early preview,” not stable
Flutter support is community-maintained, not first-party
The README is heavy on feature lists and light on technical architecture details

Verdict

Mobile and embedded developers who need on-device LLM inference without reinventing the wheel should look here. If you’re running everything in the cloud anyway, or need training capabilities, this is the wrong layer of the stack.
