Is llama-gpt open source?

Yes — getumbrel/llama-gpt is open source, released under the MIT license.

What language is llama-gpt written in?

getumbrel/llama-gpt is primarily written in TypeScript.

How popular is llama-gpt?

getumbrel/llama-gpt has 10.9k stars on GitHub.

Where can I find llama-gpt?

getumbrel/llama-gpt is on GitHub at https://github.com/getumbrel/llama-gpt.

← all repositories

getumbrel/llama-gpt

ChatGPT in a Docker container, minus the cloud and the NDAs

A one-click, offline Llama 2 chatbot that keeps your prompts on your hardware.

★10.9k stars TypeScript Chat Assistants Inference · Serving

View on GitHub ↗ Homepage ↗

Not currently ranked — collecting fresh signals.

star history

What it does LlamaGPT wraps Llama 2 and Code Llama models in a familiar ChatGPT-style web UI, served locally via Docker. It includes an OpenAI-compatible API at localhost:3001, so existing tools can point at your basement server instead of someone else’s GPU farm. The project is maintained by Umbrel, makers of a home-server OS, and it shows: deployment targets include umbrelOS, M1/M2 Macs, generic x86/arm64 Docker hosts, and Kubernetes.

The interesting bit The real work here is packaging, not model training. LlamaGPT glues together McKay Wrigley’s Chatbot UI, Georgi Gerganov’s llama.cpp, and Andrei’s Python bindings, then adds automated model downloads and hardware-specific run scripts. The benchmarks are unusually honest: a Raspberry Pi 4 manages 0.9 tokens/sec on the 7B model, while an M1 Max hits 54 tokens/sec. You know exactly what you’re getting into.

Key highlights

Ships quantized models from 7B to 70B (and Code Llama variants), with memory requirements clearly listed
CUDA support for Nvidia GPUs; Metal support for Apple Silicon
Kubernetes manifests included for cluster deployments
OpenAI-compatible API with auto-generated docs at /docs
One-click install via umbrelOS App Store

Caveats

Custom models and runtime model switching are on the roadmap but not yet implemented
First launch downloads multi-gigabyte models and may appear hung for several minutes
Benchmarks only cover M1 Max MacBook Pro for Code Llama models; other hardware is untested

Verdict Good fit for privacy-paranoid developers, homelabbers, or anyone whose internet is unreliable. Skip it if you need model flexibility today or if your hardware is closer to the Pi 4 than the M1 Max.

Frequently asked

What is getumbrel/llama-gpt?: A one-click, offline Llama 2 chatbot that keeps your prompts on your hardware.
Is llama-gpt open source?: Yes — getumbrel/llama-gpt is open source, released under the MIT license.
What language is llama-gpt written in?: getumbrel/llama-gpt is primarily written in TypeScript.
How popular is llama-gpt?: getumbrel/llama-gpt has 10.9k stars on GitHub.
Where can I find llama-gpt?: getumbrel/llama-gpt is on GitHub at https://github.com/getumbrel/llama-gpt.