metavoiceio/metavoice-src
A 1.2B-parameter text-to-speech model for human-like, expressive synthesis with zero-shot voice cloning.

Velocity (7d): +4.9 ★ / day · Trend: steady
MetaVoice-1B is a foundational, PyTorch-based TTS model trained on 100K hours of speech. It produces emotional speech rhythm and tone in English, supports zero-shot voice cloning from 30 seconds of reference audio, and supports cross-lingual voice cloning through fine-tuning. The model can synthesize text of arbitrary length and is released under the Apache 2.0 license.