
jik876/hifi-gan

A GAN-based deep learning model for efficient and high-fidelity text-to-speech synthesis.

2.4k stars Python Image · Video · Audio
hifi-gan
Velocity (7d): +1.1 ★/day · Trend: steady

HiFi-GAN is a generative adversarial network (GAN) architecture for speech synthesis that achieves high-fidelity audio generation by modeling the periodic patterns in audio signals. The model generates 22.05 kHz audio 167.9 times faster than real time on a single V100 GPU, and a CPU-optimized variant runs 13.4 times faster than real time. It supports mel-spectrogram inversion for arbitrary speakers and can serve as the vocoder in end-to-end text-to-speech pipelines.
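To make the speed figures concrete, the short sketch below converts a "times faster than real time" factor into wall-clock synthesis time and output samples per second. It is plain arithmetic on the numbers quoted above, not code from the repository; the function names `synthesis_seconds` and `samples_per_second` are hypothetical and introduced only for illustration.

```python
# Illustrative arithmetic only; these helpers are not part of jik876/hifi-gan.

SAMPLE_RATE_HZ = 22_050  # 22.05 kHz output, as quoted in the summary


def synthesis_seconds(audio_seconds: float, speedup: float) -> float:
    """Wall-clock seconds needed to generate `audio_seconds` of audio
    when synthesis runs `speedup` times faster than real time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


def samples_per_second(speedup: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Audio samples produced per wall-clock second at the given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return speedup * sample_rate_hz


if __name__ == "__main__":
    # Speedup factors quoted in the summary: V100 GPU and CPU-optimized variant.
    for label, speedup in [("V100 GPU", 167.9), ("CPU variant", 13.4)]:
        t = synthesis_seconds(60.0, speedup)
        rate = samples_per_second(speedup)
        print(f"{label}: 60 s of audio in {t:.2f} s ({rate:,.0f} samples/s)")
```

Under these figures, one minute of audio takes roughly 0.36 s on the GPU and roughly 4.5 s with the CPU variant.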
