
gemelo-ai/vocos

A neural vocoder that synthesizes high-quality audio waveforms from mel-spectrograms or EnCodec tokens using a GAN-based approach.

1.1k stars · Python · Image · Video · Audio
Velocity (7d): +1.0 ★ / day · Trend: steady

Vocos is a fast neural vocoder that generates audio waveforms from acoustic features in a single forward pass. Unlike typical time-domain GAN vocoders, which produce waveform samples directly, it predicts spectral coefficients that are then converted to audio quickly via an inverse Fourier transform. It supports inference from mel-spectrograms and from EnCodec quantization tokens, with pretrained models available at 24 kHz.
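To make the "spectral coefficients to audio" step concrete, here is a minimal NumPy sketch of an inverse short-time Fourier transform with overlap-add. It is not the Vocos implementation: the function names, the FFT size, the hop length, and the choice to represent each frame as magnitude plus phase are assumptions made for illustration. There is no neural network here either; the coefficients come from analysing a known sine wave, standing in for what a model would predict.

```python
import numpy as np

N_FFT = 64  # illustrative frame size (assumption, not a Vocos setting)
HOP = 16    # illustrative hop length (assumption, not a Vocos setting)


def periodic_hann(n_fft):
    """Periodic Hann window, suitable for overlap-add processing."""
    return np.hanning(n_fft + 1)[:-1]


def stft(signal, n_fft=N_FFT, hop=HOP):
    """Window each frame and take its real FFT; returns (n_frames, n_fft // 2 + 1)."""
    window = periodic_hann(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft_from_mag_phase(magnitude, phase, n_fft=N_FFT, hop=HOP):
    """Turn per-frame magnitude and phase into a waveform by inverse FFT and overlap-add."""
    coeffs = magnitude * np.exp(1j * phase)          # complex spectral coefficients
    frames = np.fft.irfft(coeffs, n=n_fft, axis=1)   # one inverse FFT per frame
    window = periodic_hann(n_fft)
    length = n_fft + hop * (len(frames) - 1)
    audio = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        start = i * hop
        audio[start : start + n_fft] += frame * window
        norm[start : start + n_fft] += window ** 2
    # Divide by the summed squared window to undo the analysis/synthesis weighting.
    return audio / np.maximum(norm, 1e-8)


# Stand-in for model output: coefficients of a 440 Hz tone sampled at 24 kHz.
sample_rate = 24000
t = np.arange(1024) / sample_rate
reference = np.sin(2 * np.pi * 440 * t)
spectrum = stft(reference)
reconstructed = istft_from_mag_phase(np.abs(spectrum), np.angle(spectrum))
```

Each frame costs one inverse FFT, and a frame covers many output samples at once, which is why a vocoder that works in the spectral domain can avoid the stack of upsampling layers a time-domain generator needs. In this sketch the very first and last samples are not reconstructed exactly because the window is close to zero there.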
