hybridgroup/yzma
A Go library that wraps llama.cpp to run language and vision-language models locally with GPU acceleration.

Velocity (7d): +1.9 ★/day · Trend: steady
yzma is a Go binding for llama.cpp that runs LLM inference fully locally without depending on CGo. It runs language and vision-language models of various sizes on Linux, macOS, and Windows, using hardware acceleration such as CUDA, Metal, Vulkan, and ROCm. In place of CGo, the library uses purego and FFI to call into llama.cpp’s C library.