
PlayVoice/whisper-vits-svc

A deep learning model for end-to-end singing voice conversion using VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech).

2.9k stars · Python · Image · Video · Audio
Velocity (7d): +2.1 ★/day · Trend: steady

This project implements singing voice conversion based on the VITS architecture, which combines variational inference with adversarial learning. It converts one singer’s voice into that of another speaker and supports multiple speakers, speaker mixing, and basic F0 (fundamental frequency) editing. Training requires at least 6 GB of VRAM, and the model can handle audio with light accompaniment.
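
To make the speaker mixing and basic F0 editing mentioned above more concrete, here is a minimal sketch of what such operations commonly look like. It is not taken from the repository: the function names `shift_f0` and `mix_speakers`, the use of plain lists for the F0 contour and speaker embeddings, and the convention that an F0 of 0 marks an unvoiced frame are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Sequence


def shift_f0(f0_hz: Sequence[float], semitones: float) -> list[float]:
    """Transpose an F0 contour (in Hz) by a number of semitones.

    Hypothetical helper. Frames with F0 <= 0 are assumed to be
    unvoiced and are emitted as 0.0.
    """
    # One semitone is a frequency ratio of 2^(1/12).
    ratio = 2.0 ** (semitones / 12.0)
    return [f * ratio if f > 0 else 0.0 for f in f0_hz]


def mix_speakers(
    embeddings: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> list[float]:
    """Blend speaker embeddings with a normalized weighted average.

    Hypothetical helper. The real project may mix speakers differently.
    """
    if not embeddings or len(embeddings) != len(weights):
        raise ValueError("need one weight per speaker embedding")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same length")
    # Weighted sum per dimension, divided by the total weight.
    return [
        sum(w * e[i] for e, w in zip(embeddings, weights)) / total
        for i in range(dim)
    ]


# Raise the pitch by one octave; the unvoiced frame stays at 0.
shifted = shift_f0([220.0, 0.0, 440.0], semitones=12)

# Blend two toy speaker vectors at a 3:1 ratio.
mixed = mix_speakers([[1.0, 0.0], [0.0, 1.0]], weights=[3.0, 1.0])
```

In this sketch, shifting by 12 semitones doubles every voiced frequency, and the 3:1 mix yields a vector weighted three quarters toward the first speaker.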
