Soul-AILab/SoulX-Podcast
A text-to-speech model for generating realistic long-form, multi-speaker podcast dialogues with support for Mandarin, English, and several Chinese dialects.

Velocity (7d): +13 ★/day · Trend: steady
SoulX-Podcast is an inference codebase for generating high-fidelity podcast audio from text. It produces podcast-style, multi-turn, multi-speaker dialogue with paralinguistic controls. The system supports Mandarin and English, as well as Chinese dialects including Sichuanese, Henanese, and Cantonese, so the personalized speech it synthesizes can vary in both dialect and paralinguistic expression.
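
To make "multi-turn, multi-speaker dialogue with paralinguistic controls" concrete, the sketch below shows one way such a script could be represented before being handed to a synthesis model. It is an illustration only and does not call SoulX-Podcast: the `[S1]`-style speaker tags, the `<|laughter|>`-style event tags, and the names `Turn` and `parse_script` are all assumptions made for this example, not the project's documented input format or API.

```python
import re
from dataclasses import dataclass, field

# Hypothetical markup, assumed for illustration only:
#   "[S1]" marks the start of a turn by speaker S1
#   "<|laughter|>" marks a paralinguistic event inside a turn
TURN_RE = re.compile(r"\[(S\d+)\]\s*(.*?)(?=\[S\d+\]|$)", re.S)
EVENT_RE = re.compile(r"<\|(\w+)\|>")


@dataclass
class Turn:
    speaker: str                                      # e.g. "S1"
    text: str                                         # spoken text with event tags removed
    events: list[str] = field(default_factory=list)   # e.g. ["laughter"]


def parse_script(script: str) -> list[Turn]:
    """Split a tagged script into ordered speaker turns."""
    turns = []
    for speaker, body in TURN_RE.findall(script):
        events = EVENT_RE.findall(body)
        # Drop the event tags, then collapse the leftover whitespace.
        text = " ".join(EVENT_RE.sub(" ", body).split())
        turns.append(Turn(speaker, text, events))
    return turns


if __name__ == "__main__":
    demo = "[S1] Welcome back. [S2] Thanks! <|laughter|> Glad to be here. [S1] Let's start."
    for turn in parse_script(demo):
        print(turn.speaker, turn.events, turn.text)
```

Keeping the speaker label, the text, and the paralinguistic events as separate fields means each turn could later be paired with its own reference voice or dialect setting, whatever interface the actual model exposes.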