varunvasudeva1/llm-server-docs
End-to-end documentation and scripts for deploying a fully private, local LLM server on Debian with chat, web search, RAG, model management, and MCP capabilities.

This repository provides step-by-step documentation for self-hosting an LLM server on Debian, covering Docker containerization, inference engine selection, and integration of complementary services. It includes configuration for multiple inference backends (Ollama, llama.cpp, vLLM), a chat interface (Open WebUI), a RAG pipeline for document retrieval, MCP proxy and MCPJungle for extending model capabilities, ComfyUI for image generation, and Kokoro for text-to-speech. The guide also covers system hardening, networking, and secure remote access via Tailscale.