lifeiteng/vall-e

An unofficial PyTorch implementation of VALL-E for zero-shot text-to-speech synthesis that can run on a single GPU.

Velocity (7d): +1.8 ★/day · Trend: steady

This repository reproduces VALL-E, a neural codec language model that synthesizes speech from text in a zero-shot manner while preserving the speaker's identity. It provides PyTorch training scripts and a model implementation, and relies on a phonemizer and audio processing libraries. The project includes reproduced demos and uses the icefall and k2 libraries for its training infrastructure.