← all repositories

lemonade-sdk/lemonade

A local AI server that runs open-source LLMs (Llama, Mistral, Qwen) on user-owned GPUs and NPUs with OpenAI API compatibility.

lemonade
Velocity · 7d
+11
★ / day
Trend
steady
star history

Lemonade is a local AI inference server that enables running LLMs entirely on the user’s own hardware. It supports a range of open-source models including Llama, Mistral, and Qwen for tasks like chat, coding, speech, and image generation. The system optimizes inference for AMD and NVIDIA GPUs as well as NPUs like Ryzen AI using engines such as ONNX Runtime and vLLM, providing OpenAI-compatible APIs so existing applications can connect without code changes.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.