ggml-org/LlamaBarn
A macOS menu bar app that runs local LLMs via an OpenAI-compatible API server.

Velocity · 7d
+3.8
★ / day
Trend
→steady
star history
LlamaBarn is a native macOS application that serves as a local inference server for LLMs. It provides a built-in model catalog for easy installation, automatically loads models on demand and unloads them when idle, and exposes a standard OpenAI-compatible API endpoint. The app runs entirely on-device with no cloud dependency, and integrates with chat UIs, code editors, and CLI tools like Claude Code, Cline, and Continue.