ABexit/ASR-LLM-TTS
An open-source voice assistant pipeline integrating speech recognition, large language model processing, and speech synthesis.

This project builds a speech interaction system by chaining three kinds of AI models: automatic speech recognition (SenseVoice), large language model inference (Qwen2.5-0.5B or Qwen2.5-1.5B), and text-to-speech synthesis (CosyVoice, Edge-TTS, or pyttsx3). The system transcribes voice input to text, passes the text to the LLM, and synthesizes the LLM's reply as voice output. It supports real-time interaction and is implemented in Python, with dependencies on PyTorch, Transformers, and ModelScope.
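
The three-stage chain described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `VoicePipeline`, `fake_asr`, `fake_llm`, and `fake_tts` are hypothetical names, and the stubs stand in for SenseVoice, Qwen2.5, and a TTS backend so that the example runs without any model downloads.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions for illustration only).
Transcriber = Callable[[bytes], str]   # ASR: audio bytes -> text
Responder = Callable[[str], str]       # LLM: user text -> reply text
Synthesizer = Callable[[str], bytes]   # TTS: reply text -> audio bytes


@dataclass
class VoicePipeline:
    """Chains ASR -> LLM -> TTS; each stage is a swappable callable."""
    transcribe: Transcriber
    respond: Responder
    synthesize: Synthesizer

    def run(self, audio_in: bytes) -> bytes:
        text = self.transcribe(audio_in)   # speech recognition
        reply = self.respond(text)         # language model inference
        return self.synthesize(reply)      # speech synthesis


# Stub stages: they only pass text through as bytes, standing in for real models.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_asr, fake_llm, fake_tts)
audio_out = pipeline.run(b"hello")
print(audio_out)
```

Keeping each stage behind a plain callable is one way to let the TTS backend (CosyVoice, Edge-TTS, or pyttsx3) be swapped without touching the rest of the chain; the real project may organize this differently.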