pannous/tensorflow-speech-recognition

A fossil record of how we used to teach machines to listen

An early TensorFlow seq2seq speech project that its own authors now point to as a historical artifact.

Velocity (7d): +0.6 ★/day · Trend: steady

What it does

This repo houses Python experiments in speech-to-text using TensorFlow’s now-deprecated seq2seq APIs. It includes toy classifiers for numbers and speakers, plus denser architectures, all wired up to spectrograms and live audio via pyaudio. The stated goal was a standalone Linux speech recognizer built on plentiful public training data.
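For readers unfamiliar with the spectrogram step, here is a minimal sketch of how raw audio samples become the time-by-frequency grid that classifiers like these consume. It is not taken from the repo: the function names, the frame size, the hop length, and the synthetic 1 kHz test tone are all illustrative assumptions, and the naive DFT stands in for the FFT routine a real project would call from a numerical library.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (tapers frame edges to reduce spectral leakage)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), acceptable for a toy example)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectrogram(samples, frame_size=64, hop=32):
    """Slice the signal into overlapping windowed frames and take one spectrum per frame."""
    window = hann(frame_size)
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = samples[start:start + frame_size]
        frames.append(magnitude_spectrum([s * w for s, w in zip(chunk, window)]))
    return frames  # indexed as [time][frequency]


# Synthetic input (hypothetical values): a 1 kHz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
TONE_HZ = 1000
FRAME_SIZE = 64
signal = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(signal, frame_size=FRAME_SIZE, hop=32)

# The strongest bin of the first frame, converted back to a frequency in Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
```

Each row of `spec` is one time slice and each column one frequency bin, so the tone shows up as a constant ridge at the bin corresponding to its pitch. Speech produces a far busier version of the same grid, which is what an image-style classifier is trained on.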

The interesting bit

The README is unusually honest: the authors have twice updated it to tell you to go elsewhere, first to Mozilla DeepSpeech in 2020, then to OpenAI’s Whisper in 2024. That makes it a rare self-annotated graveyard, useful for tracing how quickly the state of the art can make a project obsolete.

Key highlights

  • Built on TensorFlow 1.0 seq2seq, now incompatible with current releases
  • Includes toy examples (number_classifier_tflearn.py, speaker_classifier_tflearn.py) and a densenet variant
  • Ships with spectrogram visualizations and live recording via record.py
  • Proposes extensions that now look prescient: GPU WarpCTC, modular graphs, and “P2P learning” snapshots
  • Explicitly maintained “only for educational purposes” since 2020

Caveats

  • The installation instructions require building portaudio from source and hand-tweaking LD_LIBRARY_PATH; the README even misspells LIBRARY_PATH as LIDRARY_PATH
  • Dependencies (layer, tensorpeers) live in separate repos by the same author, so the project has to be assembled from several pieces
  • The only training script is train.sh, and its contents are undocumented; getting to a working model is left as an exercise

Verdict

Worth a quick browse if you’re writing a history-of-STT talk or want to see how seq2seq was wielded in 2016. Anyone building something today should follow the authors’ own advice and use Whisper instead.
