r9y9/wavenet_vocoder
A PyTorch implementation of the WaveNet vocoder, which generates high-quality raw speech samples conditioned on linguistic or acoustic features.

This repository provides a neural vocoder based on the WaveNet architecture. It generates raw audio waveforms conditioned on linguistic or acoustic features and supports three output distributions for modeling 16-bit audio: a mixture of logistic distributions, a mixture of Gaussians, and a single Gaussian. The vocoder serves as a component in text-to-speech systems, converting acoustic representations into speech, and it integrates with the ESPnet speech processing toolkit.
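
To make the output distributions mentioned above more concrete, here is a minimal plain-Python sketch of drawing one audio sample from a mixture of Gaussians and converting it to a 16-bit PCM value. It is an illustration only, not the repository's code: the function names (`softmax`, `sample_mixture_of_gaussians`, `to_int16`) and the parameter layout (per-component logits, means, and log standard deviations) are assumptions made for this example, and it uses only the standard library rather than PyTorch.

```python
import math
import random


def softmax(logits):
    """Convert unnormalized logits into mixture weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_mixture_of_gaussians(logits, means, log_scales, rng):
    """Draw one sample from a 1-D mixture of Gaussians, clipped to [-1, 1].

    Hypothetical parameter layout: one logit, mean, and log standard
    deviation per mixture component.
    """
    weights = softmax(logits)
    # Pick a mixture component according to its weight.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Sample from the chosen Gaussian component.
    x = rng.gauss(means[k], math.exp(log_scales[k]))
    # Clip to the normalized waveform range.
    return max(-1.0, min(1.0, x))


def to_int16(x):
    """Map a float in [-1, 1] to a signed 16-bit PCM value."""
    return int(round(x * 32767))


rng = random.Random(0)  # seeded for reproducibility
sample = sample_mixture_of_gaussians(
    logits=[0.0, 1.0], means=[-0.5, 0.5], log_scales=[-3.0, -3.0], rng=rng
)
pcm = to_int16(sample)
```

In this sketch a single Gaussian is simply the one-component case (one logit, one mean, one log scale). In a trained vocoder the network would predict these parameters at every time step from the previous samples and the conditioning features; here they are fixed by hand.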