second-state/echokit_server
Rust-based voice agent server that orchestrates speech recognition, LLM inference, and text-to-speech in a configurable pipeline.

EchoKit Server is a central component for voice AI interaction that connects an ESP32-based hardware device to AI services. It implements a complete ASR→LLM→TTS pipeline (speech recognition, then LLM inference, then text-to-speech) and lets developers customize speech models, define LLM prompts, and integrate MCP servers for extended functionality. Each stage supports OpenAI-compatible APIs, and Google Gemini and Alibaba Qwen models are additionally supported end to end. The server can run locally or connect to remote inference servers.
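To make the three-stage flow concrete, here is a minimal Rust sketch of an ASR→LLM→TTS pipeline with swappable stages. It is an illustration only and is not taken from the echokit_server codebase: the trait names (`Asr`, `Llm`, `Tts`), the `Pipeline` struct, the stub implementations, and the `demo_pipeline` helper are all hypothetical, and the real server talks to network services rather than in-process stubs.

```rust
// Hypothetical stage interfaces; names are illustrative, not EchoKit's actual API.
trait Asr {
    /// Turn raw PCM samples into text.
    fn transcribe(&self, audio: &[i16]) -> String;
}

trait Llm {
    /// Produce a reply for the user's text under a configurable system prompt.
    fn reply(&self, system_prompt: &str, user_text: &str) -> String;
}

trait Tts {
    /// Turn reply text back into PCM samples.
    fn synthesize(&self, text: &str) -> Vec<i16>;
}

/// A pipeline that is generic over its three stages, so each one can be
/// swapped independently (for example, a local model or a remote API client).
struct Pipeline<A, L, T> {
    asr: A,
    llm: L,
    tts: T,
    system_prompt: String,
}

impl<A: Asr, L: Llm, T: Tts> Pipeline<A, L, T> {
    /// Run one utterance through ASR, then the LLM, then TTS.
    fn handle_utterance(&self, audio: &[i16]) -> Vec<i16> {
        let text = self.asr.transcribe(audio);
        let answer = self.llm.reply(&self.system_prompt, &text);
        self.tts.synthesize(&answer)
    }
}

// Stub stages standing in for real services (assumptions for the demo).
struct StubAsr;
impl Asr for StubAsr {
    fn transcribe(&self, audio: &[i16]) -> String {
        format!("{} samples", audio.len())
    }
}

struct StubLlm;
impl Llm for StubLlm {
    fn reply(&self, system_prompt: &str, user_text: &str) -> String {
        format!("[{}] heard: {}", system_prompt, user_text)
    }
}

struct StubTts;
impl Tts for StubTts {
    fn synthesize(&self, text: &str) -> Vec<i16> {
        // Fake "audio": one sample per byte of text.
        text.bytes().map(i16::from).collect()
    }
}

/// Build a pipeline wired with the stub stages and a sample prompt.
fn demo_pipeline() -> Pipeline<StubAsr, StubLlm, StubTts> {
    Pipeline {
        asr: StubAsr,
        llm: StubLlm,
        tts: StubTts,
        system_prompt: String::from("be brief"),
    }
}

fn main() {
    let pipeline = demo_pipeline();
    // 160 silent samples stand in for audio streamed from the device.
    let out = pipeline.handle_utterance(&[0i16; 160]);
    println!("synthesized {} samples", out.len());
}
```

Keeping each stage behind its own trait mirrors the idea that any stage can point at a different backend. An end-to-end model such as Gemini or Qwen would presumably bypass this three-step split and map audio to audio directly, but how the server handles that is not described here.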