AIGC-Audio/AudioGPT
AudioGPT is a framework that uses LLMs to orchestrate multiple audio foundation models for speech synthesis, music generation, and sound understanding.

Velocity (7d): +8.6 ★/day · Trend: steady
AudioGPT connects large language models to a set of audio processing models, enabling tasks such as text-to-speech synthesis, speech recognition, style transfer, music generation, and talking head animation. It orchestrates multiple foundation models, including FastSpeech, VITS, GenerSpeech, and Whisper, through a unified interface.
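To make the "unified interface" idea concrete, here is a minimal sketch of one way such orchestration could be structured: a registry of named tools plus a router that picks one per request. This is not AudioGPT's actual code. `AudioTool`, `ToolRegistry`, `choose_tool`, and `handle` are hypothetical names, the handlers are stubs that return placeholder strings instead of calling real models, and the keyword rules only stand in for the decision an LLM would make.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AudioTool:
    """Hypothetical wrapper around one audio model (stubbed here)."""
    name: str
    description: str
    run: Callable[[str], str]


class ToolRegistry:
    """Single entry point that hides which model serves which task."""

    def __init__(self) -> None:
        self._tools: Dict[str, AudioTool] = {}

    def register(self, tool: AudioTool) -> None:
        self._tools[tool.name] = tool

    def names(self) -> List[str]:
        return sorted(self._tools)

    def dispatch(self, tool_name: str, payload: str) -> str:
        if tool_name not in self._tools:
            raise KeyError(f"unknown tool: {tool_name}")
        return self._tools[tool_name].run(payload)


def choose_tool(request: str) -> str:
    """Stand-in for the LLM's routing decision (assumption: keyword rules)."""
    text = request.lower()
    if "transcribe" in text:
        return "speech_recognition"
    if "say" in text or "speak" in text:
        return "text_to_speech"
    raise ValueError("no matching tool for request")


registry = ToolRegistry()
# Stub handlers: a real system would invoke a TTS or ASR model here.
registry.register(AudioTool(
    "text_to_speech", "Synthesize speech from text",
    lambda text: f"<audio for: {text}>"))
registry.register(AudioTool(
    "speech_recognition", "Transcribe an audio file",
    lambda path: f"<transcript of: {path}>"))


def handle(request: str, payload: str) -> str:
    """Route a natural-language request to one registered tool."""
    return registry.dispatch(choose_tool(request), payload)
```

Calling `handle("Please transcribe this file", "clip.wav")` routes to the stubbed recognition tool. In a real system the router would be an LLM prompted with each tool's `description`, and adding a new capability would only require another `register` call.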