0nutation/SpeechGPT
SpeechGPT is a series of research projects developing large language models capable of perceiving and generating speech audio.

Velocity (7d): +1.3 ★/day · Trend: steady
The repository hosts several speech LLM research projects: SpeechGPT (cross-modal conversational LLMs), SpeechGPT-Gen (chain-of-information speech generation), SpeechAgents (multi-modal multi-agent systems), and SpeechTokenizer (unified speech tokenization). Together they explore how to integrate speech and audio modalities into LLM architectures for conversation and generation.