sybil-solutions/vllm-studio

A local-first desktop/CLI workstation for launching, managing, and chatting with self-hosted LLM backends, including vLLM, SGLang, llama.cpp, and ExLlamaV3.

★ 1.1k stars · TypeScript · Tags: Inference · Serving, LLMOps · Eval, Agents


Velocity (7d): +6.4 ★/day · Trend: steady

vLLM Studio provides a controller API built on Bun and Hono, a Next.js/Electron frontend, and a CLI for managing model lifecycle operations, GPU metrics, logs, and OpenAI-compatible inference endpoints. It supports multiple serving backends on NVIDIA GPUs (CUDA) and Apple Silicon (MLX), as well as GGUF recipes, and integrates an agent runtime based on pi-coding-agent for agent sessions.
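
Because the endpoints are described as OpenAI-compatible, a client can in principle talk to them with the standard chat-completions request shape. The following is a minimal sketch, not vLLM Studio's documented API: the helper `buildChatRequest`, the base URL `http://localhost:8000`, and the model name `my-model` are all hypothetical placeholders. Only the `/v1/chat/completions` path and the `model`/`messages` body fields come from the OpenAI API convention.

```typescript
// Minimal sketch: build a request for an OpenAI-compatible chat endpoint.
// The base URL, port, and model name used in the usage note are
// placeholders, not values documented by vLLM Studio.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  // Strip trailing slashes so the joined path has exactly one separator.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage (requires a running server; not executed here):
// const { url, init } = buildChatRequest("http://localhost:8000", "my-model", [
//   { role: "user", content: "Hello" },
// ]);
// const res = await fetch(url, init);
// const data = await res.json();
// console.log(data.choices[0].message.content);
```

Separating request construction from the `fetch` call keeps the sketch checkable without a server; the actual host, port, and any authentication headers depend on how the controller is configured.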