google/gemma.cpp
A standalone C++ inference engine for running Google Gemma foundation models on CPU with SIMD optimization.

Velocity (7d): +8.2 ★/day · Trend: steady
gemma.cpp is a lightweight, minimalist C++ inference engine for Google's Gemma-2, Gemma-3, and PaliGemma-2 models. It targets research and experimentation rather than production deployment: the core is small (about 2K lines of code), and portable SIMD support comes from the Google Highway library. The engine is designed to be easy to embed, with minimal dependencies, and easy to modify for experiments.