
fishaudio/Bert-VITS2

Bert-VITS2 is a multilingual text-to-speech model that combines the VITS2 end-to-end synthesis architecture with multilingual BERT embeddings to improve voice synthesis.

8.8k stars Python Image · Video · Audio
Velocity (7d): +8.4 ★ / day · Trend: steady

The project implements VITS2, an end-to-end neural TTS model, and augments it with multilingual BERT embeddings to improve prosody and pronunciation accuracy. Training data is preprocessed with webui_preprocess.py, and the BERT embeddings provide support for multiple languages. The architecture builds on prior work from MassTTS and jaywalnut310/vits.
