Is autovc open source?

Yes — auspicious3000/autovc is open source, released under the MIT license.

What language is autovc written in?

auspicious3000/autovc is primarily written in Python.

How popular is autovc?

auspicious3000/autovc has 1.1k stars on GitHub.

Where can I find autovc?

auspicious3000/autovc is on GitHub at https://github.com/auspicious3000/autovc.

auspicious3000/autovc

Voice cloning without the adversarial drama

A 2019 voice conversion system that learned to swap speakers using only autoencoder loss—no GANs, no parallel data, no fuss.

★ 1.1k stars · Python · Image · Video · Audio

What it does

AutoVC converts speech from one speaker to another without needing recordings of the same sentence from both people. Feed it a mel-spectrogram of the source utterance, along with speaker embeddings for the source and target voices, and it reshapes the voice while preserving the words and prosody. The repo includes pre-trained models, Jupyter notebooks for conversion and vocoding, and a tiny verification dataset.
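
The conversion path described above can be sketched with toy stand-ins for the networks. This is a minimal illustration, not the repository's model: the random matrices replace trained encoder and decoder layers, and the sizes (80 mel bins, 256-dimensional speaker embeddings, a 32-dimensional code, 4x time downsampling) and every function name are assumptions chosen for the example. It also assumes the encoder sees the source speaker's embedding alongside the spectrogram.

```python
import numpy as np

# All sizes are illustrative assumptions, not values read from the repository.
N_MELS, EMB_DIM, CODE_DIM, DOWNSAMPLE = 80, 256, 32, 4

rng = np.random.default_rng(0)
# Random matrices stand in for the trained encoder and decoder networks.
W_enc = rng.standard_normal((N_MELS + EMB_DIM, CODE_DIM)) * 0.01
W_dec = rng.standard_normal((CODE_DIM + EMB_DIM, N_MELS)) * 0.01


def encode_content(mel, source_emb):
    """Map a (frames, N_MELS) spectrogram to a narrow, time-downsampled code."""
    frames = mel.shape[0]
    tiled = np.tile(source_emb, (frames, 1))  # same speaker vector on every frame
    codes = np.concatenate([mel, tiled], axis=1) @ W_enc
    return codes[::DOWNSAMPLE]  # keep every DOWNSAMPLE-th frame


def decode(codes, target_emb, frames):
    """Upsample the code by repetition and rebuild frames with the target speaker."""
    upsampled = np.repeat(codes, DOWNSAMPLE, axis=0)[:frames]
    tiled = np.tile(target_emb, (frames, 1))
    return np.concatenate([upsampled, tiled], axis=1) @ W_dec


def convert(mel, source_emb, target_emb):
    """Encode with the source speaker vector, decode with the target one."""
    codes = encode_content(mel, source_emb)
    return decode(codes, target_emb, mel.shape[0])


mel = rng.standard_normal((128, N_MELS))
src = rng.standard_normal(EMB_DIM)
tgt = rng.standard_normal(EMB_DIM)
converted = convert(mel, src, tgt)
print(converted.shape)
```

Only the array shapes and the swap of speaker vector between encoding and decoding carry over to a real system; the narrow, downsampled code plays the role of the bottleneck.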

The interesting bit

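The zero-shot claim is the hook: the model supposedly generalizes to speakers it never heard during training, using only an autoencoder loss rather than adversarial training. The trick is a carefully sized bottleneck, narrow enough to squeeze out speaker identity but wide enough to keep the content. The authors also ship a HiFi-GAN vocoder as an alternative to the original WaveNet vocoder, which spares you the molasses-slow, sample-by-sample autoregressive synthesis.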
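
To make "autoencoder loss only" concrete, here is a hedged sketch of such an objective: a squared-error term on the self-reconstructed spectrogram plus an L1 term asking the reconstruction to encode to the same content code. The function names and the `content_weight` parameter are hypothetical, and the repository's actual training loop may combine or weight its terms differently.

```python
import numpy as np


def reconstruction_loss(mel_true, mel_pred):
    """Mean squared error between the input and its self-reconstruction."""
    return float(np.mean((mel_true - mel_pred) ** 2))


def content_loss(code_true, code_pred):
    """Mean absolute difference between the two content codes."""
    return float(np.mean(np.abs(code_true - code_pred)))


def autoencoder_objective(mel_true, mel_pred, code_true, code_pred,
                          content_weight=1.0):
    """Weighted sum of the two terms; no discriminator appears anywhere."""
    return (reconstruction_loss(mel_true, mel_pred)
            + content_weight * content_loss(code_true, code_pred))


# Toy check: a uniform error of 0.01 in every mel bin.
mel = np.zeros((4, 80))
noisy = mel + 0.01
print(reconstruction_loss(mel, noisy))
```

A uniform error of 0.01 per bin yields a squared-error term of about 0.0001, which gives a rough feel for the scale of the convergence figure quoted in the highlights, assuming similarly scaled spectrograms.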

Key highlights

PyTorch implementation with pre-trained weights for the converter, speaker encoder, and vocoder
Supports both GE2E embeddings (zero-shot) and one-hot vectors (closed speaker set)
Includes HiFi-GAN v1 weights for faster waveform generation
Training converges at reconstruction loss ~0.0001, per the README
Paper accepted at ICML 2019; audio demo available
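
The two conditioning options in the list above differ only in what vector represents a speaker. A minimal sketch, assuming a closed set is indexed by integer ID and that an unseen speaker is summarised by averaging several per-utterance embeddings and L2-normalising the result. The helper names and the normalisation step are assumptions, and the toy 2-dimensional vectors stand in for real speaker-encoder outputs.

```python
import numpy as np


def one_hot_speaker(index, num_speakers):
    """Closed-set conditioning: one slot per speaker seen in training."""
    vec = np.zeros(num_speakers, dtype=np.float32)
    vec[index] = 1.0
    return vec


def averaged_embedding(utterance_embeddings):
    """Open-set conditioning: average per-utterance vectors, then L2-normalise."""
    stacked = np.asarray(utterance_embeddings, dtype=np.float32)
    mean = stacked.mean(axis=0)
    return mean / np.linalg.norm(mean)


closed_set_vec = one_hot_speaker(2, 5)
# Hypothetical per-utterance embeddings for a speaker unseen in training.
zero_shot_vec = averaged_embedding([[3.0, 0.0], [1.0, 0.0]])
print(closed_set_vec, zero_shot_vec)
```

A one-hot vector has no slot for a new voice, so a closed-set model needs retraining to add one, while an embedding computed from a few utterances can describe a speaker the converter never saw.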

Caveats

The bundled wav data is “very small” and explicitly for code verification only—you bring your own dataset
Training/testing metadata formats differ, which is a footgun waiting to happen
Dependencies include PyTorch ≥0.4.1 and TensorFlow ≥1.3 (the latter only for TensorBoard, but still)

Verdict

Worth a look if you’re researching voice conversion or need a baseline that predates the diffusion/vocoder-heavy modern stack. Skip it if you want turnkey voice cloning for production; this is research code with 2019 ergonomics.
