wwbin2017/bailing
An open-source voice assistant that combines ASR, LLM, and TTS to enable natural speech-to-speech conversation with tool-calling capabilities.

百聆 (Bailing) is a voice conversation bot that mimics GPT-4o’s speech interaction capabilities. It chains automatic speech recognition (ASR), a large language model (LLM) such as DeepSeek R1, and text-to-speech synthesis (TTS) into a single pipeline. OpenClaw serves as the core tool-calling engine, handling complex tasks, external tool orchestration, and high-level agent capabilities. The project targets edge devices and low-resource environments, with an end-to-end latency of around 800 ms and no GPU required.
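
The ASR, LLM, and TTS stages described above can be pictured as three functions called in sequence, with a timer around each one to show where the end-to-end latency goes. The following is only a minimal sketch under stated assumptions: `VoicePipeline`, `fake_asr`, `fake_llm`, and `fake_tts` are hypothetical names invented for illustration and are not Bailing's or OpenClaw's actual API. The stub stages stand in for real models, and tool calling is left out.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class VoicePipeline:
    """Hypothetical ASR -> LLM -> TTS chain (illustrative, not Bailing's real classes)."""

    asr: Callable[[bytes], str]   # audio in  -> recognized text
    llm: Callable[[str], str]     # user text -> reply text
    tts: Callable[[str], bytes]   # reply text -> audio out
    timings_ms: Dict[str, float] = field(default_factory=dict)

    def _timed(self, name, fn, arg):
        # Run one stage and record its wall-clock duration in milliseconds.
        start = time.perf_counter()
        result = fn(arg)
        self.timings_ms[name] = (time.perf_counter() - start) * 1000.0
        return result

    def respond(self, audio_in: bytes) -> bytes:
        # One conversational turn: speech in, speech out.
        text = self._timed("asr", self.asr, audio_in)
        reply = self._timed("llm", self.llm, text)
        return self._timed("tts", self.tts, reply)

    def total_latency_ms(self) -> float:
        # Sum of the per-stage timings for the most recent turn.
        return sum(self.timings_ms.values())


# Stub stages (assumptions): real ones would wrap an ASR model, an LLM, and a TTS engine.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(asr=fake_asr, llm=fake_llm, tts=fake_tts)
audio_out = pipeline.respond(b"hello")
print(audio_out, pipeline.timings_ms)
```

Passing the stages in as plain callables means each one can be swapped independently, for example to try a different LLM. The per-stage timings make it easy to see which stage dominates a latency budget such as the roughly 800 ms figure quoted above.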