fluxions-ai/vui
A real-time voice assistant server that streams audio through WebRTC, transcribes with Whisper, runs a local LLM, and synthesizes speech with Vui Nano TTS.

Vui is a streaming conversational voice assistant that runs entirely locally on a single Python server. It chains three stages: faster-whisper for speech recognition, a local Qwen-based LLM for response generation, and Vui Nano, a 300M-parameter speech transformer, for TTS. The system supports WebSocket-based streaming, VAD-driven turn detection, and barge-in, which lets the user interrupt the assistant mid-reply. It is also compatible with the OpenAI Realtime API, so it can serve as a drop-in replacement in existing clients.
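
The chained pipeline and barge-in behavior described above can be illustrated with a minimal asyncio sketch. This is not Vui's actual code: `transcribe`, `generate_reply`, `synthesize`, and `Session` are hypothetical stand-ins for the ASR, LLM, and TTS stages, and the sketch assumes VAD has already split the audio into user turns. It only shows one plausible way to cancel an in-flight reply when a new user turn arrives.

```python
import asyncio


async def transcribe(audio: bytes) -> str:
    # Hypothetical stand-in for the ASR stage: treat the bytes as text.
    return audio.decode()


async def generate_reply(text: str) -> str:
    # Hypothetical stand-in for the LLM stage.
    return f"You said: {text}"


async def synthesize(text: str):
    # Hypothetical stand-in for streaming TTS: one "audio chunk" per word.
    for word in text.split():
        await asyncio.sleep(0)  # yield to the event loop, as real synthesis would
        yield word.encode()


class Session:
    """One conversation; at most one reply is being spoken at a time."""

    def __init__(self) -> None:
        self.played: list[bytes] = []  # chunks that reached the "speaker"
        self._reply_task: asyncio.Task | None = None

    async def on_user_turn(self, audio: bytes) -> None:
        # Barge-in: a new user turn cancels whatever reply is still playing.
        await self.interrupt()
        self._reply_task = asyncio.create_task(self._respond(audio))

    async def _respond(self, audio: bytes) -> None:
        text = await transcribe(audio)
        reply = await generate_reply(text)
        async for chunk in synthesize(reply):
            self.played.append(chunk)

    async def interrupt(self) -> None:
        task, self._reply_task = self._reply_task, None
        if task is not None and not task.done():
            task.cancel()
            try:
                await task
            except asyncio.CancelledError:
                pass  # expected: the reply was cut short on purpose

    async def wait(self) -> None:
        if self._reply_task is not None:
            await self._reply_task


async def demo() -> list[bytes]:
    session = Session()
    await session.on_user_turn(b"one two three four five six")
    # Let the first reply start playing, then interrupt it.
    while len(session.played) < 2:
        await asyncio.sleep(0)
    await session.on_user_turn(b"stop")
    await session.wait()
    return session.played


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the first reply is cancelled after a couple of chunks, and the second reply then plays to completion. A real server would also need to flush audio already buffered on the client, which is outside the scope of this example.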