HuggingChat's engine just went on a diet
Hugging Face stripped its chat UI down to OpenAI-compatible APIs and added a local heuristic router that picks models based on what you're actually doing.

What it does
Chat UI is the SvelteKit chat interface that powers HuggingChat. It talks to any OpenAI-compatible API endpoint (Hugging Face's router, llama.cpp, Ollama, OpenRouter, Poe) and now auto-discovers models from the /models endpoint instead of requiring manual configuration.
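To make the discovery step concrete, here is a minimal sketch of what reading a /models response could look like. It assumes the standard OpenAI-style list body (an object with a `data` array of entries carrying an `id`); the function names `modelsUrl` and `listModelIds` are hypothetical and not taken from the Chat UI codebase.

```typescript
// Hypothetical helper: join a base URL and the /models path without doubling slashes.
function modelsUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "") + "/models";
}

// Hypothetical helper: pull model ids out of an already-parsed response body.
// Assumes the OpenAI-style shape { object: "list", data: [{ id: "..." }, ...] }.
// Entries without a string id are skipped rather than treated as errors.
function listModelIds(body: unknown): string[] {
  if (typeof body !== "object" || body === null) return [];
  const data = (body as { data?: unknown }).data;
  if (!Array.isArray(data)) return [];
  const ids: string[] = [];
  for (const entry of data) {
    if (typeof entry === "object" && entry !== null) {
      const id = (entry as { id?: unknown }).id;
      if (typeof id === "string") ids.push(id);
    }
  }
  return ids;
}
```

In practice you would fetch `modelsUrl(base)` with your API key, parse the JSON, and hand the result to `listModelIds`; the network call is left out so the sketch stays self-contained.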
The interesting bit
The “Omni” router is the clever part: a server-side heuristic that routes requests locally without calling out to a separate selection model. Attach an image and it hits a multimodal route; enable an MCP tool and it switches to an agentic route. No cloud router service, no extra latency—just request-shape sniffing and a JSON policy file you maintain yourself.
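The request-shape sniffing described above might look roughly like the sketch below. Everything here is an assumption for illustration: the `ChatRequestShape` fields, the route names, and the choice to check images before tools are not confirmed by the source.

```typescript
// Hypothetical request shape: only the signals this heuristic inspects.
interface ChatRequestShape {
  hasImageAttachment: boolean;
  enabledMcpTools: string[];
}

// Hypothetical route names; a real policy file would define its own.
type RouteName = "multimodal" | "agentic" | "default";

// Pick a route purely from the shape of the request, with no model call.
// Precedence (image first, then tools) is an assumption of this sketch.
function pickRoute(req: ChatRequestShape): RouteName {
  if (req.hasImageAttachment) return "multimodal";
  if (req.enabledMcpTools.length > 0) return "agentic";
  return "default";
}
```

The point of the design is that the decision is a couple of cheap checks on data the server already has, which is why it adds no extra round trip.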
Key highlights
- Supports any OpenAI-compatible backend via OPENAI_BASE_URL (legacy provider integrations and GGUF discovery removed)
- Embedded MongoDB for local dev; falls back to ./db automatically when MONGODB_URL is unset
- MCP tool calling with preconfigured servers, user-added servers, and per-model capability overrides
- Docker image chat-ui-db bundles MongoDB for single-container deployment
- Theming via env vars; PUBLIC_APP_ASSETS switches between chatui and huggingchat branding
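The MongoDB fallback in the list above boils down to one environment check. A minimal sketch, assuming the env is passed in as a plain record; the `DbTarget` type and `resolveDbTarget` name are hypothetical, and treating an empty string as unset is this sketch's choice rather than documented behavior.

```typescript
// Hypothetical result type: either an external MongoDB or the embedded one.
type DbTarget =
  | { kind: "external"; url: string }
  | { kind: "embedded"; path: string };

// Decide where data lives based on MONGODB_URL.
// Assumption: an empty or whitespace-only value counts as unset.
function resolveDbTarget(env: Record<string, string | undefined>): DbTarget {
  const url = env["MONGODB_URL"];
  if (url !== undefined && url.trim() !== "") {
    return { kind: "external", url };
  }
  // Fallback named in the list above: embedded database stored under ./db.
  return { kind: "embedded", path: "./db" };
}
```

Taking the environment as an argument instead of reading it globally keeps the function easy to test and makes the fallback explicit.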
Caveats
- No sample routes policy ships with the repo; you must write your own JSON array for the LLM router
- Legacy provider-specific integrations, embeddings, and web-search helpers are gone (available on the legacy branch)
- MCP tool routing bypasses your policy file entirely and uses a single configured model, falling back to the heuristic only on misconfiguration
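Since no sample policy ships, here is a rough idea of how a routes array and the tool bypass could fit together. The field names (`name`, `description`, `primary_model`), the model ids, and the `resolveModel` function are all assumptions for illustration; check the repo for the actual schema before copying any of this.

```typescript
// Hypothetical route entry; field names are illustrative, not the confirmed schema.
interface RouteEntry {
  name: string;
  description: string;
  primary_model: string;
}

// Hypothetical policy: the JSON array you would maintain yourself.
const examplePolicy: RouteEntry[] = [
  { name: "multimodal", description: "Requests with images", primary_model: "example/vision-model" },
  { name: "default", description: "Everything else", primary_model: "example/chat-model" },
];

// Resolve a model id. When tools are active and a tools model is configured,
// the policy is skipped entirely; otherwise the route is looked up by name.
// Returns undefined if the route is not in the policy.
function resolveModel(
  policy: RouteEntry[],
  routeName: string,
  toolsActive: boolean,
  toolsModel?: string
): string | undefined {
  if (toolsActive && toolsModel !== undefined && toolsModel !== "") {
    return toolsModel; // bypass: the policy file is never consulted
  }
  const match = policy.find((route) => route.name === routeName);
  return match === undefined ? undefined : match.primary_model;
}
```

The branch order mirrors the caveat above: a configured tools model wins outright, and the policy only comes back into play when that setting is missing.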
Verdict
Good fit if you want a polished chat UI with pluggable backends and don’t mind writing a router policy. Skip it if you need the old GGUF discovery, embeddings, or provider-native integrations—those moved to maintenance mode on the legacy branch.