
kan-bayashi/ParallelWaveGAN

PyTorch implementation of neural vocoder models (Parallel WaveGAN, MelGAN, HiFi-GAN, StyleMelGAN) for real-time text-to-speech synthesis.

1.6k stars · Jupyter Notebook · Image · Video · Audio
Velocity (7d): +0.7 ★ / day · Trend: steady

This repository provides unofficial PyTorch implementations of several non-autoregressive neural vocoders, which convert mel-spectrograms to audio waveforms. The models are Parallel WaveGAN, MelGAN, Multi-band MelGAN, HiFi-GAN, and StyleMelGAN. They are designed to work with TTS systems such as ESPnet-TTS: a mel-spectrogram predictor produces the spectrogram from text, and the vocoder turns it into high-quality speech in real time.
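To make "real time" concrete, here is a minimal sketch of how a vocoder's output length and real-time factor (RTF) are commonly computed. It does not use the repository's API. The sample rate, hop size, frame count, and the 0.25 s synthesis time are illustrative assumptions, not values or benchmarks taken from this project.

```python
def expected_num_samples(num_frames: int, hop_size: int) -> int:
    """Waveform length when each mel frame is upsampled to hop_size samples."""
    if num_frames < 0 or hop_size <= 0:
        raise ValueError("num_frames must be >= 0 and hop_size must be > 0")
    return num_frames * hop_size


def real_time_factor(synthesis_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = generation time / duration of generated audio.

    A value below 1.0 means audio is produced faster than it plays back.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be > 0")
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


# Assumed, illustrative settings (hypothetical, not read from any config).
SAMPLE_RATE = 24000   # samples per second
HOP_SIZE = 300        # waveform samples per mel frame
NUM_FRAMES = 400      # length of the input mel-spectrogram in frames

num_samples = expected_num_samples(NUM_FRAMES, HOP_SIZE)

# Hypothetical wall-clock time for one forward pass; measure your own.
rtf = real_time_factor(0.25, num_samples, SAMPLE_RATE)

print(f"{num_samples} samples, {num_samples / SAMPLE_RATE:.1f} s of audio, RTF = {rtf:.3f}")
```

Because a non-autoregressive vocoder generates all samples in one forward pass rather than one sample at a time, its synthesis time grows slowly with utterance length, which is what makes an RTF well below 1.0 feasible.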
