shibing624/parrots

A Chinese speech toolkit that actually ships working models

Parrots wraps ASR and TTS into pip-installable Python with pre-trained voices and emotional fine-tuning.

525 stars Python Image · Video · Audio
Velocity (7d): +0.2 ★ / day · Trend: steady

What it does

Parrots is a Python toolkit for speech recognition and synthesis with a clear focus on Chinese, English, and Japanese. It packages distilwhisper for ASR and GPT-SoVITS for TTS into one-liner initializations, plus a newer IndexTTS2 model that adds emotional control. You can pip-install it, point at a HuggingFace speaker model, and generate audio without training anything yourself.

The interesting bit

The emotional control in IndexTTS2 is unusually granular. You can feed it a separate emotion reference audio clip, tweak an 8-dimensional emotion vector (happy, angry, sad, scared, disgusted, gloomy, surprised, calm), or let it infer mood from the text itself. There’s even pinyin mixing for precise pronunciation control in Chinese, which is useful when a character has more than one possible reading.
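To make the emotion vector concrete, here is a minimal sketch of building one from named weights. The emotion order is copied from the list above. The helper name `emotion_vector`, the 0 to 1 weight range, and the idea that the resulting list is what gets handed to IndexTTS2 are all assumptions for illustration, not confirmed parrots API.

```python
# Emotion names in the order listed above. Whether IndexTTS2 expects exactly
# this order and a 0..1 range is an assumption, not something confirmed here.
EMOTIONS = (
    "happy", "angry", "sad", "scared",
    "disgusted", "gloomy", "surprised", "calm",
)


def emotion_vector(**weights: float) -> list[float]:
    """Build an 8-slot list of weights; unnamed emotions default to 0.0."""
    unknown = set(weights) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotions: {sorted(unknown)}")
    for name, weight in weights.items():
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"{name} weight must be in [0, 1], got {weight}")
    return [float(weights.get(name, 0.0)) for name in EMOTIONS]


vec = emotion_vector(happy=0.7, surprised=0.3)
print(vec)  # [0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.0]
```

Naming the slots keeps call sites readable and catches typos early, which matters when the model would otherwise silently accept a misordered list.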

Key highlights

  • One-line ASR: SpeechRecognition().recognize_speech_from_file("foo.wav") returns {"text": "..."}
  • Pre-trained speaker personas including “singing female anchor” and “game male anchor” voices
  • Streaming TTS with configurable chunk size for real-time scenarios
  • CLI entry points: parrots asr file.wav and parrots tts "text" out.wav
  • Emotion decoupled from speaker identity: same voice, different moods via emo_alpha or reference audio
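The one-line ASR call in the first highlight could be wrapped as below. `SpeechRecognition` and `recognize_speech_from_file` come from that highlight. The top-level import path `from parrots import SpeechRecognition`, the `transcribe` helper, and the `FakeRecognizer` stand-in are assumptions added so the sketch runs without downloading a model.

```python
from typing import Any, Optional


def transcribe(path: str, recognizer: Optional[Any] = None) -> str:
    """Return the recognized text for one audio file."""
    if recognizer is None:
        # Assumption: the class is importable from the top-level package.
        from parrots import SpeechRecognition
        recognizer = SpeechRecognition()
    result = recognizer.recognize_speech_from_file(path)
    # The highlight above describes the result as {"text": "..."}.
    return result["text"].strip()


class FakeRecognizer:
    """Hypothetical stand-in that mimics the documented return shape."""

    def recognize_speech_from_file(self, path: str) -> dict:
        return {"text": f" transcript of {path} "}


print(transcribe("foo.wav", recognizer=FakeRecognizer()))
# transcript of foo.wav
```

With parrots and a model available, calling `transcribe("foo.wav")` without the second argument would take the real code path; injecting the recognizer also lets one instance be reused across many files.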

Caveats

  • The README claims “high accuracy” but offers no benchmarks or comparison numbers
  • Pinyin control is explicitly noted as not supporting all possible pinyin combinations
  • Setup still requires manual PyTorch install before pip install parrots
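Because PyTorch has to be installed separately, a small preflight check can fail early with a clearer message. This sketch uses only the standard library. The package names `torch` and `parrots` are the usual import names, and the `missing_packages` helper is hypothetical.

```python
import importlib.util


def missing_packages(names: list[str]) -> list[str]:
    """Return the top-level packages from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_packages(["torch", "parrots"])
if missing:
    print("Install these first:", ", ".join(missing))
else:
    print("torch and parrots are both importable")
```

The check only confirms that the packages can be located; it does not verify that the installed PyTorch build matches your CUDA setup.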

Verdict

Worth a look if you need Mandarin TTS with emotional range and don’t want to wrestle with model checkpoints yourself. Skip if you need rigorous accuracy metrics or production SLAs: this is a convenience wrapper, not a research benchmark.
