
Rongjiehuang/ProDiff

ProDiff is a PyTorch implementation of a conditional diffusion model for high-quality text-to-speech synthesis.

432 stars Python Image · Video · Audio
Star velocity (7d): +0.3 ★ / day · Trend: steady

ProDiff is a progressive fast diffusion model for high-quality text-to-speech synthesis. It uses a conditional diffusion probabilistic model to generate high-fidelity speech from text input. The approach aims to accelerate the typically slow diffusion sampling process to enable industrial deployment. The repository includes PyTorch implementations, pretrained models, and tutorials for speech diffusion models.
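To make the sampling-cost point concrete, here is a minimal sketch of a generic DDPM-style reverse sampling loop in plain Python. It is not taken from the ProDiff repository: `make_schedule`, `toy_denoiser`, and `sample` are hypothetical names, the linear noise schedule and the step counts are arbitrary choices, and the "network" is a stand-in that simply returns the conditioning vector. The only thing it is meant to show is that each sampling step costs one denoiser call, which is why reducing the number of steps speeds up synthesis.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule plus cumulative products of (1 - beta). Values are assumed."""
    if num_steps == 1:
        betas = [beta_end]
    else:
        step = (beta_end - beta_start) / (num_steps - 1)
        betas = [beta_start + step * i for i in range(num_steps)]
    alphas_cumprod = []
    prod = 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alphas_cumprod.append(prod)
    return betas, alphas_cumprod


def toy_denoiser(x_t, t, condition):
    """Hypothetical stand-in for a trained network that predicts the clean sample.

    A real model would map (noisy input, step, text condition) to speech features;
    this toy version just echoes the condition.
    """
    return list(condition)


def sample(condition, num_steps, seed=0):
    """Run the reverse process; return the final sample and the denoiser call count."""
    rng = random.Random(seed)
    betas, acp = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in condition]  # start from Gaussian noise
    calls = 0
    for t in reversed(range(num_steps)):
        x0_hat = toy_denoiser(x, t, condition)
        calls += 1  # one network evaluation per step
        acp_t = acp[t]
        acp_prev = acp[t - 1] if t > 0 else 1.0
        beta_t = betas[t]
        # Mean and variance of the Gaussian posterior q(x_{t-1} | x_t, x0_hat).
        coef_x0 = math.sqrt(acp_prev) * beta_t / (1.0 - acp_t)
        coef_xt = math.sqrt(1.0 - beta_t) * (1.0 - acp_prev) / (1.0 - acp_t)
        sigma = math.sqrt(beta_t * (1.0 - acp_prev) / (1.0 - acp_t)) if t > 0 else 0.0
        x = [
            coef_x0 * clean + coef_xt * noisy + sigma * rng.gauss(0.0, 1.0)
            for clean, noisy in zip(x0_hat, x)
        ]
    return x, calls


if __name__ == "__main__":
    condition = [0.5, -1.0, 2.0]
    for steps in (1000, 4):
        _, calls = sample(condition, steps)
        print(f"{steps} sampling steps -> {calls} denoiser calls")
```

With a real neural denoiser, every call is a full forward pass, so the loop length dominates inference time. How ProDiff itself shortens the loop is described in the repository and its tutorials, not in this sketch.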
