
shivammehta25/Matcha-TTS

A neural text-to-speech model that generates speech from text using conditional flow matching.

1.3k stars · Jupyter Notebook · Image · Video · Audio
Velocity (7d): +1.3 ★ / day · Trend: steady

Matcha-TTS is a non-autoregressive neural TTS system, published at ICASSP 2024, that uses conditional flow matching to synthesize speech from text. Training resembles that of diffusion models, but instead of learning to denoise, the model regresses a vector field that defines a probability path from noise to acoustic features. At inference it solves an ODE along that path to generate the output; in the published system the ODE produces mel-spectrograms, which a separate vocoder converts to waveforms. The design aims to be fast, probabilistic, and memory-efficient while producing natural-sounding speech.
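The training target and the ODE-based inference described above can be illustrated with a toy sketch. It assumes the straight-line (optimal-transport) variant of conditional flow matching with a small constant `SIGMA_MIN`, uses plain Python lists in place of acoustic features, and replaces the neural network with a hand-written "oracle" vector field for a single target. All function names here are hypothetical and are not taken from the Matcha-TTS codebase.

```python
import random

SIGMA_MIN = 1e-4  # assumed small constant; not a value read from the repository


def sample_path(x0, x1, t, sigma_min=SIGMA_MIN):
    """Point at time t on the straight-line path from noise x0 towards data x1."""
    return [(1 - (1 - sigma_min) * t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1, sigma_min=SIGMA_MIN):
    """Regression target: the constant velocity along that straight-line path."""
    return [b - (1 - sigma_min) * a for a, b in zip(x0, x1)]


def cfm_loss(predict, x0, x1, t):
    """Mean squared error between a predicted velocity and the target velocity."""
    xt = sample_path(x0, x1, t)
    u = target_velocity(x0, x1)
    v = predict(xt, t)
    return sum((vi - ui) ** 2 for vi, ui in zip(v, u)) / len(u)


def euler_solve(vector_field, x0, n_steps):
    """Integrate dx/dt = vector_field(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        v = vector_field(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_oracle_field(x1, sigma_min=SIGMA_MIN):
    """Closed-form conditional vector field for one known target x1.

    Stands in for a trained network; a real model would have to learn this
    field from data rather than being handed the target.
    """
    a = 1 - sigma_min

    def field(x, t):
        return [(b - a * xi) / (1 - a * t) for xi, b in zip(x, x1)]

    return field


# Toy usage: transport a Gaussian noise vector to a fixed 3-dimensional target.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
target = [0.5, -1.0, 2.0]
oracle = make_oracle_field(target)
generated = euler_solve(oracle, noise, n_steps=10)
```

Because the conditional path is a straight line, even a handful of Euler steps lands close to the target in this toy setting. A learned field is only approximately straight, so the number of solver steps becomes a speed versus quality trade-off.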
