CjangCjengh/MoeGoe
An executable for running VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) text-to-speech and voice conversion models.

Velocity (7d): +1.7 ★/day · Trend: steady
MoeGoe is a command-line interface for VITS model inference. It performs text-to-speech synthesis and voice conversion, and supports several VITS variants, including HuBERT-VITS and W2V2-VITS, the latter adding dimensional emotion control. Users load a pretrained model checkpoint and its config file, then either generate speech audio from text or convert a recording from one speaker's voice to another's.
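As a rough illustration of the config-loading step, the sketch below reads a VITS-style JSON config and turns input text into symbol IDs, the form a model would consume. This is not MoeGoe's own code: the helper names `load_config` and `text_to_ids`, the field names (`data.sampling_rate`, `speakers`, `symbols`), and the example values are all assumptions chosen for the demo, and real configs may apply language-specific text cleaning before this step.

```python
import json
import tempfile
from pathlib import Path


def load_config(path):
    """Read a VITS-style JSON config file into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def text_to_ids(text, symbols):
    """Map each character to its index in the symbol table.

    Characters missing from the table are skipped.
    """
    index = {s: i for i, s in enumerate(symbols)}
    return [index[ch] for ch in text if ch in index]


# Hypothetical config contents (assumed layout), written to a temp file
# so the example runs without any real model files.
example = {
    "data": {"sampling_rate": 22050},
    "speakers": ["speaker_a", "speaker_b"],
    "symbols": ["_", " ", "a", "b", "c"],
}
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    cfg_path.write_text(json.dumps(example), encoding="utf-8")
    config = load_config(cfg_path)

# "!" is not in the symbol table, so it is dropped.
ids = text_to_ids("a cab!", config["symbols"])
# Speakers are addressed by their position in the config's speaker list.
speaker_id = config["speakers"].index("speaker_b")
print(ids, speaker_id)
```

In a real run, the symbol IDs and the chosen speaker ID would be passed to the loaded checkpoint, and the resulting waveform written out at the configured sampling rate.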