sybil-solutions/vllm-studio
A local-first desktop/CLI workstation for launching, managing, and chatting with self-hosted LLM backends including VLLM, Sglang, llama.cpp, and exllamav3.

Velocity · 7d
+6.4
★ / day
Trend
→steady
star history
vLLM Studio provides a controller API backed by Bun/Hono, a Next.js/Electron frontend interface, and a CLI for managing model lifecycle operations, GPU metrics, logs, and OpenAI-compatible inference endpoints. It supports multiple serving backends across NVIDIA GPUs (CUDA), Apple Silicon (MLX), and GGUF recipes, while also integrating an agent runtime using pi-coding-agent for agent sessions.