PlayVoice/vits_chinese
A Chinese Text-to-Speech system built on VITS, with BERT embeddings for natural prosody and techniques from Microsoft's NaturalSpeech to reduce sound errors.

This project implements a Chinese Text-to-Speech system based on the VITS architecture (Variational Inference with adversarial learning for end-to-end Text-to-Speech). It uses BERT to extract hidden prosody embeddings, which help the model place natural pauses at grammatical boundaries. It also incorporates the inference loss from Microsoft's NaturalSpeech to reduce sound errors, and supports ONNX streaming inference for deployment.
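
To make the BERT prosody idea concrete, here is a minimal sketch of one way character-level embeddings could be aligned with a phoneme sequence before being fed to an acoustic model. It assumes each Chinese character maps to a known number of phonemes and simply repeats that character's embedding once per phoneme. The function name `expand_char_embeddings` and the toy vectors are hypothetical and are not taken from the repository.

```python
from typing import List, Sequence


def expand_char_embeddings(
    char_embeds: Sequence[Sequence[float]],
    phones_per_char: Sequence[int],
) -> List[List[float]]:
    """Repeat each character embedding once per phoneme of that character.

    Hypothetical helper: assumes one embedding per character and a known
    phoneme count per character (for example, initial + final = 2).
    """
    if len(char_embeds) != len(phones_per_char):
        raise ValueError("need exactly one phoneme count per character")

    expanded: List[List[float]] = []
    for vec, count in zip(char_embeds, phones_per_char):
        if count < 0:
            raise ValueError("phoneme counts must be non-negative")
        for _ in range(count):
            # Copy so later edits to one phoneme's vector do not affect others.
            expanded.append(list(vec))
    return expanded


# Toy example: two characters with 2-dimensional embeddings.
# The first character has two phonemes, the second has one.
char_embeds = [[0.1, 0.2], [0.3, 0.4]]
phone_embeds = expand_char_embeddings(char_embeds, [2, 1])
print(len(phone_embeds))  # one embedding per phoneme
```

With this kind of alignment the phoneme sequence and the prosody features have the same length, so they can be combined element by element. Real embeddings would come from a BERT model rather than hand-written lists.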