← all repositories

NVIDIA/BigVGAN

A universal neural vocoder that generates high-quality audio waveforms from acoustic features for speech and music synthesis.

1.2k stars Python Image · Video · Audio
BigVGAN
Velocity · 7d
+0.8
★ / day
Trend
steady
star history

BigVGAN is a neural vocoder model published at ICLR 2023, designed to convert acoustic features such as mel-spectrograms into high-fidelity audio waveforms. It uses a GAN-based architecture with custom CUDA kernels for accelerated inference. The model supports speech synthesis, singing voice synthesis, and general audio generation, with pretrained checkpoints available via Hugging Face.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.