← all repositories

ggml-org/LlamaBarn

A macOS menu bar app that runs local LLMs via an OpenAI-compatible API server.

LlamaBarn
Velocity · 7d
+3.8
★ / day
Trend
steady
star history

LlamaBarn is a native macOS application that serves as a local inference server for LLMs. It provides a built-in model catalog for easy installation, automatically loads models on demand and unloads them when idle, and exposes a standard OpenAI-compatible API endpoint. The app runs entirely on-device with no cloud dependency, and integrates with chat UIs, code editors, and CLI tools like Claude Code, Cline, and Continue.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.