zai-org/GLM-TTS

A text-to-speech synthesis system using large language models that supports zero-shot voice cloning and emotion control via multi-reward reinforcement learning.

Velocity (7d): +5.5 ★ / day · Trend: steady

GLM-TTS is an LLM-based TTS system with a two-stage architecture: an LLM generates a sequence of speech tokens, and a Flow model converts those tokens into audio waveforms. It uses multi-reward reinforcement learning to improve emotional expression and make prosody more natural. It supports streaming inference and zero-shot voice cloning from 3 to 10 seconds of prompt audio.
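
The two-stage flow described above can be pictured with a small toy sketch. This is not the GLM-TTS API: every name here (`generate_speech_tokens`, `tokens_to_waveform`, `synthesize_streaming`, `chunk_size`, `samples_per_token`) is hypothetical, and both stages are replaced by trivial stand-ins so the example runs on its own. It only illustrates how a token generator can feed a token-to-audio converter chunk by chunk, which is one plausible shape for streaming inference.

```python
from typing import Iterator, List


def generate_speech_tokens(text: str, prompt_tokens: List[int]) -> Iterator[int]:
    """Hypothetical stand-in for stage 1 (the LLM): yields one speech token at a time."""
    for i, ch in enumerate(text):
        # Toy rule, not a real model: derive a token id from the character,
        # its position, and the length of the voice prompt.
        yield (ord(ch) + len(prompt_tokens) + i) % 1024


def tokens_to_waveform(tokens: List[int], samples_per_token: int = 4) -> List[float]:
    """Hypothetical stand-in for stage 2 (the Flow model): maps tokens to audio samples."""
    wave: List[float] = []
    for t in tokens:
        # Toy rule: each token becomes a short run of constant-valued samples.
        wave.extend([t / 1024.0] * samples_per_token)
    return wave


def synthesize_streaming(
    text: str, prompt_tokens: List[int], chunk_size: int = 8
) -> Iterator[List[float]]:
    """Emit audio chunk by chunk instead of waiting for the full token sequence."""
    buffer: List[int] = []
    for tok in generate_speech_tokens(text, prompt_tokens):
        buffer.append(tok)
        if len(buffer) == chunk_size:
            yield tokens_to_waveform(buffer)
            buffer = []
    if buffer:
        # Flush the final, possibly shorter, chunk.
        yield tokens_to_waveform(buffer)


if __name__ == "__main__":
    text = "hello world, this is a test"
    for n, chunk in enumerate(synthesize_streaming(text, prompt_tokens=[1, 2, 3])):
        print(f"chunk {n}: {len(chunk)} samples")
```

In this sketch the first audio chunk is available after `chunk_size` tokens, so latency depends on the chunk size rather than on the length of the whole utterance.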