lucidrains/voicebox-pytorch
A PyTorch implementation of Voicebox, MetaAI's state-of-the-art text-to-speech generative model.

This repository provides a PyTorch implementation of Voicebox, a text-to-speech model developed by MetaAI. It uses rotary embeddings rather than ALiBi for positional information in its bidirectional model, and it applies adaptive normalization. The generative audio model is trained with conditional flow matching.
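To make the conditional flow matching objective concrete, here is a minimal sketch of how one training example could be built. It is not code from the repository: it assumes the commonly used straight-line (optimal-transport) path between a Gaussian noise sample and a data sample, it uses plain Python lists instead of PyTorch tensors to stay dependency-free, and the names `cfm_training_pair`, `cfm_loss`, and `SIGMA_MIN` are hypothetical.

```python
import random

# Assumed small constant for the path's minimum noise level;
# the value is illustrative and not taken from the repository.
SIGMA_MIN = 1e-5


def cfm_training_pair(x1, t, sigma_min=SIGMA_MIN, rng=random):
    """Build one conditional-flow-matching training example (hypothetical helper).

    x1 : list of floats, a data sample (for example one frame of audio features)
    t  : float in [0, 1], the flow time step
    Returns (x0, x_t, target): the noise sample, the point on the straight
    path from noise to data at time t, and the velocity a model would be
    trained to predict at that point.
    """
    # Draw Gaussian noise with the same shape as the data sample.
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]
    # Interpolate: x_t = (1 - (1 - sigma_min) * t) * x0 + t * x1
    scale = 1.0 - (1.0 - sigma_min) * t
    x_t = [scale * n + t * d for n, d in zip(x0, x1)]
    # Target velocity along the path: x1 - (1 - sigma_min) * x0
    target = [d - (1.0 - sigma_min) * n for n, d in zip(x0, x1)]
    return x0, x_t, target


def cfm_loss(predicted, target):
    """Mean squared error between predicted and target velocities."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


if __name__ == "__main__":
    rng = random.Random(0)
    x1 = [0.5, -1.0, 2.0]
    x0, x_t, target = cfm_training_pair(x1, t=0.3, rng=rng)
    # A real model would predict the velocity from (x_t, t, conditioning);
    # a zero prediction stands in for it here.
    print(cfm_loss([0.0] * len(x1), target))
```

In an actual training loop, the zero prediction would be replaced by the network's output for `x_t` at time `t`, and the same loss would be averaged over a batch. Under this path, following the target velocity from `x_t` for the remaining time `1 - t` lands at `x1` up to a term of size `sigma_min`.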