babysor/MockingBird

A voice cloner that speaks Mandarin, with baggage

Real-time TTS voice cloning forked for Chinese, but the repo is now in maintenance mode while its author builds a commercial successor.

36.9k stars Python Image · Video · Audio
Velocity (7d): +21 ★ / day · Trend: steady

What it does

MockingBird is a PyTorch implementation of SV2TTS (speaker verification to multispeaker text-to-speech) that lets you clone a voice from a short audio sample and generate new speech in that voice. It’s explicitly forked from CorentinJ’s Real-Time-Voice-Cloning project, with the key addition of Mandarin Chinese support through multiple datasets (aidatatang_200zh, magicdata, aishell3, data_aishell). You can run it via web server, desktop toolbox, or command line.

The interesting bit

The project reuses the pretrained encoder and vocoder models while requiring you to train or download a new synthesizer built for Chinese symbols. It is a pragmatic split: it saves compute but adds setup friction. The M1 Mac setup instructions are endearingly elaborate, involving Rosetta terminals, manual C header paths, and a custom pythonM1 wrapper script.
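To make that split concrete, here is a minimal sketch of the three-stage SV2TTS shape: an encoder turns reference audio into a speaker embedding, a synthesizer turns text plus that embedding into mel frames, and a vocoder turns the frames into a waveform. Everything here is hypothetical and for illustration only: `VoiceCloner`, `toy_encoder`, `toy_synthesizer`, and `toy_vocoder` are made-up names with toy arithmetic standing in for real models, and none of it is MockingBird's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List

Waveform = List[float]
Embedding = List[float]
Mel = List[List[float]]


@dataclass
class VoiceCloner:
    """Hypothetical container for the three SV2TTS stages (not MockingBird's API)."""
    encoder: Callable[[Waveform], Embedding]       # pretrained, reused as-is
    synthesizer: Callable[[str, Embedding], Mel]   # the stage swapped for Mandarin
    vocoder: Callable[[Mel], Waveform]             # pretrained, reused as-is

    def clone(self, reference_audio: Waveform, text: str) -> Waveform:
        embedding = self.encoder(reference_audio)  # who is speaking
        mel = self.synthesizer(text, embedding)    # what is said, in that voice
        return self.vocoder(mel)                   # frames back to audio samples


# Toy stand-ins so the sketch runs; real stages are neural networks.
def toy_encoder(audio: Waveform) -> Embedding:
    mean = sum(audio) / len(audio)
    peak = max(abs(sample) for sample in audio)
    return [mean, peak]


def toy_synthesizer(text: str, embedding: Embedding) -> Mel:
    # One fake "frame" per character, each a copy of the speaker embedding.
    return [list(embedding) for _ in text]


def toy_vocoder(mel: Mel) -> Waveform:
    # Flatten the frames into a single sample list.
    return [value for frame in mel for value in frame]


cloner = VoiceCloner(toy_encoder, toy_synthesizer, toy_vocoder)
wave = cloner.clone([0.1, -0.2, 0.3], "你好")
print(len(wave))  # 4: two characters, two values per frame
```

Because the stages only meet at the embedding and the mel frames, the middle one can be replaced for a new language without retraining the other two, which is the trade the project makes.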

Key highlights

  • Supports Mandarin out of the box, unlike the original English-only upstream
  • Community-shared pretrained synthesizer models available (Baidu Pan, Aliyun Drive)
  • Web server mode for remote API-style usage
  • Multiple vocoder options: WaveRNN, HiFi-GAN, and Fre-GAN
  • Tested on Tesla T4 and GTX 2060; Windows, Linux, and M1 macOS supported

Caveats

  • Author no longer actively updates the repo; development has moved to commercial product noiz.ai
  • requirements.txt is pinned to August 2021 PyTorch builds (1.9.0 with CUDA 10.2) and breaks with newer stacks
  • demo_cli is non-functional; you must obtain or train a Chinese-compatible synthesizer model
  • Several community models only work with repo tag 0.0.1
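Given the pinned dependencies, it can help to check an environment before installing anything else. The sketch below compares installed package versions against expected pins using the standard library's `importlib.metadata`. The `PINNED` table and the `check_pins` helper are hypothetical; only the torch 1.9.0 pin comes from the caveat above, and the repo's full requirements list is not reproduced here.

```python
from importlib import metadata

# Assumption: only the pin named in the caveats; extend from the repo's own file.
PINNED = {"torch": "1.9.0"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: found_version_or_None} for every pin that does not match."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        # Ignore local build tags such as "+cu102" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    for package, found in check_pins(PINNED).items():
        print(f"{package}: expected {PINNED[package]}, found {found}")
```

An empty result means the environment matches the pins; anything else is a hint to use a separate virtual environment for this repo.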

Verdict

Worth a look if you specifically need open-source Mandarin voice cloning and can tolerate dated dependencies. Skip it if you want maintained code or a polished English experience; the original upstream or newer alternatives are likely smoother.
