huggingface/chat-ui

HuggingChat's engine just went on a diet

Hugging Face stripped its chat UI down to OpenAI-compatible APIs and added a local heuristic router that picks a model based on the shape of each request.

10.8k stars · TypeScript · Chat Assistants
Velocity (7d): +8.9 ★/day · Trend: steady

What it does

Chat UI is a SvelteKit chat interface that powers HuggingChat. It talks to any OpenAI-compatible API endpoint (Hugging Face’s router, llama.cpp, Ollama, OpenRouter, Poe) and now auto-discovers models from the /models endpoint instead of requiring manual configuration.
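To make the auto-discovery idea concrete, here is a minimal sketch of listing models from an OpenAI-compatible backend. It assumes the standard OpenAI-style response, a JSON object whose `data` array holds entries with an `id`. The helper names (`modelsUrl`, `extractModelIds`, `discoverModels`) are hypothetical and are not taken from the chat-ui codebase.

```typescript
// Build the discovery URL from a base URL such as "http://localhost:8080/v1".
function modelsUrl(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/models`;
}

// Pull model ids out of a parsed /models response, skipping malformed entries.
// Assumes the OpenAI-style shape: { data: [{ id: "..." }, ...] }.
function extractModelIds(body: unknown): string[] {
  const data = (body as { data?: unknown } | null)?.data;
  if (!Array.isArray(data)) return [];
  return data
    .map((entry) => entry?.id)
    .filter((id): id is string => typeof id === "string" && id.length > 0);
}

// Hypothetical helper: fetch the model list, sending a bearer token if one is given.
async function discoverModels(baseUrl: string, apiKey?: string): Promise<string[]> {
  const headers: Record<string, string> = {};
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  const res = await fetch(modelsUrl(baseUrl), { headers });
  if (!res.ok) throw new Error(`GET /models failed with status ${res.status}`);
  return extractModelIds(await res.json());
}
```

Keeping the parsing separate from the network call means the same function can be pointed at llama.cpp, Ollama, or a hosted router and checked against a canned response first.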

The interesting bit

The “Omni” router is the clever part: a server-side heuristic that routes requests locally without calling out to a separate selection model. Attach an image and it hits a multimodal route; enable an MCP tool and it switches to an agentic route. No cloud router service, no extra latency—just request-shape sniffing and a JSON policy file you maintain yourself.
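As a rough illustration of request-shape sniffing, the sketch below picks a route from two signals and then looks the route up in a policy table. Everything here is an assumption for illustration: the `ChatRequestShape` fields, the route names, the placeholder model ids, and the choice to check images before tools are hypothetical, and the real policy file format may differ.

```typescript
// Hypothetical request shape: only the signals this heuristic inspects.
interface ChatRequestShape {
  hasImageAttachment: boolean;
  enabledMcpServers: string[];
}

type RouteName = "multimodal" | "agentic" | "default";

// Decide a route from the request shape alone, with no call to another model.
// Assumption: an attached image takes precedence over enabled tools.
function pickRoute(req: ChatRequestShape): RouteName {
  if (req.hasImageAttachment) return "multimodal";
  if (req.enabledMcpServers.length > 0) return "agentic";
  return "default";
}

// Hypothetical policy table mapping each route to a placeholder model id.
const policy: Record<RouteName, string> = {
  multimodal: "example/vision-model",
  agentic: "example/tools-model",
  default: "example/general-model",
};

// Resolve the model id for a request by combining the heuristic and the policy.
function resolveModel(req: ChatRequestShape): string {
  return policy[pickRoute(req)];
}
```

For example, `resolveModel({ hasImageAttachment: false, enabledMcpServers: ["search"] })` returns the placeholder `"example/tools-model"`.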

Key highlights

  • Supports any OpenAI-compatible backend via OPENAI_BASE_URL (legacy provider integrations and GGUF discovery removed)
  • Embedded MongoDB for local dev; when MONGODB_URL is unset, it starts automatically and persists to ./db
  • MCP tool calling with preconfigured servers, user-added servers, and per-model capability overrides
  • Docker image chat-ui-db bundles MongoDB for single-container deployment
  • Theming via env vars; PUBLIC_APP_ASSETS switches between chatui and huggingchat branding

Caveats

  • No sample routes policy ships with the repo—you must write your own JSON array for the LLM router
  • Legacy provider-specific integrations, embeddings, and web-search helpers are gone (available on the legacy branch)
  • MCP tool routing bypasses your policy file entirely and uses a single configured model, falling back to heuristic routing only if that model is missing or misconfigured
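The tools shortcut in the last caveat can be pictured as follows. This is a sketch under assumptions, not the project's code: `resolveToolsModel` and its parameters are hypothetical, and "misconfigured" is interpreted here as an empty id or one the backend does not list.

```typescript
// Return the dedicated tools model when it is usable, otherwise defer to the
// heuristic router. The policy file is never consulted on the happy path.
function resolveToolsModel(
  toolsModel: string | undefined,
  availableModels: ReadonlySet<string>,
  heuristicFallback: () => string,
): string {
  const id = toolsModel?.trim();
  // Assumption: a model counts as configured only if the backend lists it.
  if (id && availableModels.has(id)) return id;
  return heuristicFallback();
}
```

The practical upshot is that any route you write for tool-heavy requests only matters when the dedicated tools model is unavailable.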

Verdict

Good fit if you want a polished chat UI with pluggable backends and don’t mind writing a router policy. Skip it if you need the old GGUF discovery, embeddings, or provider-native integrations—those moved to maintenance mode on the legacy branch.
