Rongjiehuang/ProDiff
ProDiff is a PyTorch implementation of a conditional diffusion model for high-quality text-to-speech synthesis.

Star velocity (7 days): +0.3 ★/day, trend steady
ProDiff is a progressive fast diffusion model for high-quality text-to-speech synthesis. It uses a conditional diffusion probabilistic model to generate high-fidelity speech from text input. Diffusion sampling is typically slow, and ProDiff aims to accelerate it enough for industrial deployment. The repository includes PyTorch implementations, pretrained models, and tutorials for speech diffusion models.
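To illustrate why diffusion sampling is slow, here is a minimal sketch of a generic DDPM-style reverse sampling loop. It is not ProDiff's code or its acceleration method. The names `ToyDenoiser`, `linear_beta_schedule`, and `sample` are hypothetical, the denoiser is a placeholder rather than a trained network, and the conditioning vector merely stands in for text-derived features. The point is only that each reverse step costs one network call, so fewer steps means faster synthesis.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances, one per diffusion step (an assumed schedule)."""
    if num_steps == 1:
        return [beta_end]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


class ToyDenoiser:
    """Hypothetical stand-in for a conditional noise-prediction network."""

    def __init__(self):
        self.calls = 0  # counts how many times the "network" is evaluated

    def __call__(self, x, t, condition):
        self.calls += 1
        # Placeholder prediction: treat the offset from the condition as noise.
        return [xi - ci for xi, ci in zip(x, condition)]


def sample(denoiser, condition, num_steps, seed=0):
    """Run a generic DDPM-style reverse process starting from Gaussian noise."""
    rng = random.Random(seed)
    betas = linear_beta_schedule(num_steps)
    alphas = [1.0 - b for b in betas]

    # Cumulative products of alphas, used to scale the predicted noise.
    alpha_bars = []
    prod = 1.0
    for a in alphas:
        prod *= a
        alpha_bars.append(prod)

    x = [rng.gauss(0.0, 1.0) for _ in condition]
    for t in reversed(range(num_steps)):
        eps = denoiser(x, t, condition)  # one network call per step
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            # Add fresh noise on every step except the last one.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


if __name__ == "__main__":
    condition = [0.0, 0.5, -0.5, 1.0]  # stand-in for text-derived features
    for steps in (1000, 4):
        net = ToyDenoiser()
        sample(net, condition, num_steps=steps)
        print(f"{steps} sampling steps -> {net.calls} denoiser calls")
```

In this sketch the number of denoiser calls always equals `num_steps`, so cutting the step count reduces the sampling cost proportionally. Keeping output quality high at a small step count is the hard part, and the toy denoiser does not attempt to model it.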