xxbb1234021/speech_recognition

Mandarin speech recognition, circa 2018

A TensorFlow-based Chinese ASR project that predates the transformer era and still draws curious visitors.

851 stars · Python · Image · Video · Audio
Velocity (7d): +0.3 ★/day · Trend: steady

What it does Trains a neural network to recognize Mandarin speech using the THCHS-30 corpus from Tsinghua University. You edit a config file, run train.py, and hope your GPU cooperates. Testing happens via test.py or, if you prefer, PyCharm.

The interesting bit The README is aggressively minimal: no architecture diagram, no accuracy numbers, no pretrained model links. Just “here is Python 3.5, here is TensorFlow 1.5, good luck.” There’s something almost archival about it now; this is deep learning before the Cambrian explosion of convenient tooling.

Key highlights

  • Targets THCHS-30, a well-known open Chinese speech corpus
  • Requires Python 3.5 and TensorFlow 1.5.0 (both long past end-of-life)
  • Configuration through a single conf.ini file
  • Includes one test screenshot showing… something working, presumably
  • Roughly 850 stars suggest it solved a real need for Mandarin-speaking developers at the time
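
For a sense of what single-file configuration looks like in practice, here is a minimal sketch using Python's standard configparser. The README does not document the actual contents of conf.ini, so every section name, key, and path below is hypothetical and only illustrates the pattern of reading settings before training.

```python
import configparser

# Hypothetical contents: the real conf.ini may use entirely different
# sections, keys, and paths. These are placeholders for illustration.
SAMPLE_INI = """
[data]
wav_path = data/thchs30/train
label_file = data/thchs30/train.word.txt

[train]
batch_size = 16
epochs = 100
"""


def load_settings(text):
    """Parse INI text and return typed settings as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "wav_path": parser.get("data", "wav_path"),
        "label_file": parser.get("data", "label_file"),
        # getint converts the raw string value to an integer
        "batch_size": parser.getint("train", "batch_size"),
        "epochs": parser.getint("train", "epochs"),
    }


settings = load_settings(SAMPLE_INI)
print(settings)
```

A training script would typically call something like `parser.read("conf.ini")` instead of parsing an inline string; the inline sample just keeps the sketch self-contained.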

Caveats

  • Dependencies are frozen in 2018; expect dependency archaeology to get it running
  • No word on model architecture, WER/CER metrics, or hardware requirements
  • No pretrained weights provided, so you bring your own compute and patience

Verdict Worth a look if you’re studying the evolution of Chinese ASR or need a historical baseline. Everyone else should probably start with modern toolkits like WeNet, Whisper, or ESPnet.
