ming024/FastSpeech2

A PyTorch implementation of Microsoft's FastSpeech 2 neural text-to-speech model for generating speech audio from text.

Star velocity (7 days): +1.0 ★ / day. Trend: steady.

This repository implements Microsoft’s FastSpeech 2 architecture, a neural text-to-speech system. It supports multi-speaker synthesis in English and Mandarin, with datasets including LibriTTS and AISHELL-3. It includes training pipelines and inference scripts, and uses neural vocoders such as MelGAN and HiFi-GAN to convert mel-spectrograms into waveform audio.