NVIDIA/BigVGAN
A universal neural vocoder that generates high-quality audio waveforms from acoustic features for speech and music synthesis.

Velocity (7d): +0.8 ★/day · Trend: steady
BigVGAN is a neural vocoder published at ICLR 2023. It converts acoustic features such as mel-spectrograms into high-fidelity audio waveforms, using a GAN-based architecture with custom CUDA kernels for accelerated inference. The model supports speech synthesis, singing voice synthesis, and general audio generation, and pretrained checkpoints are available via Hugging Face.
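To make the input side concrete, here is a minimal NumPy sketch of a log-mel spectrogram, the kind of acoustic feature a vocoder turns back into a waveform. It is not BigVGAN's own preprocessing code. The sample rate (24 kHz), FFT size (1024), hop length (256), and band count (100) are illustrative assumptions, and the helper names are hypothetical.

```python
import numpy as np


def hz_to_mel(f):
    # HTK-style mel scale (one common convention; assumed here).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2.
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_points = mel_to_hz(
        np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    )
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def log_mel_spectrogram(wave, sr=24000, n_fft=1024, hop=256, n_mels=100):
    # Slice the signal into overlapping frames (no padding) and window them.
    n_frames = 1 + (len(wave) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wave[idx] * np.hanning(n_fft)
    # Magnitude spectrum per frame: shape (n_frames, n_fft // 2 + 1).
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Project onto mel bands: shape (n_mels, n_frames).
    mel = mel_filterbank(sr, n_fft, n_mels) @ mag.T
    # Clip before the log so silent bands stay finite.
    return np.log(np.clip(mel, 1e-5, None))


# One second of a 440 Hz sine as a stand-in for real audio.
sr = 24000
t = np.arange(sr) / sr
wave = 0.5 * np.sin(2 * np.pi * 440.0 * t)
mel = log_mel_spectrogram(wave, sr=sr)
print(mel.shape)  # (n_mels, n_frames)
```

With these assumed settings, each mel frame stands for 256 audio samples, so a vocoder consuming this feature has to upsample by that factor in time to reconstruct the waveform.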