TensorFlow 2 speech recognition that ships to your phone
A training-to-deployment toolkit for ASR models, with a notable focus on Vietnamese and TFLite export.

What it does
TensorFlowASR bundles several well-known speech recognition architectures—Conformer, ContextNet, DeepSpeech2, Jasper, RNN Transducer—into a single TensorFlow 2 training pipeline. It handles feature extraction, augmentations, and exports finished models to TFLite for edge deployment. The project also curates corpus links for English (LibriSpeech, Common Voice) and Vietnamese (Vivos, InfoRe, VietBud500).
The interesting bit
The “almost state-of-the-art” framing is refreshingly honest, and the TFLite path is treated as a first-class citizen rather than an afterthought: the converted model becomes a direct audio-to-text function. The Vietnamese dataset curation is unusually thorough for a community ASR repo.
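To make "direct audio-to-text function" concrete, here is a minimal sketch of calling an exported model through the standard `tf.lite.Interpreter` API. The input and output layout is an assumption, not taken from the project: it supposes the first input is a 1-D float32 waveform and the first output carries the transcript. The names `pcm16_to_float` and `transcribe` are hypothetical, and a real export may expect extra inputs (for example decoder states), so inspect `get_input_details()` first.

```python
def pcm16_to_float(samples):
    """Scale signed 16-bit PCM samples to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def transcribe(model_path, waveform):
    """Run one utterance through a TFLite model (hypothetical I/O layout).

    Assumes input 0 is a 1-D float32 waveform and output 0 holds the result.
    """
    # Imported lazily so the helper above works without TensorFlow installed.
    import numpy as np
    import tensorflow as tf

    signal = np.asarray(waveform, dtype=np.float32)
    interpreter = tf.lite.Interpreter(model_path=model_path)
    input_info = interpreter.get_input_details()[0]
    # Resize the input to this utterance's length before allocating tensors.
    interpreter.resize_tensor_input(input_info["index"], list(signal.shape))
    interpreter.allocate_tensors()
    interpreter.set_tensor(input_info["index"], signal)
    interpreter.invoke()
    output_info = interpreter.get_output_details()[0]
    # The raw output tensor; how to turn it into text depends on the export.
    return interpreter.get_tensor(output_info["index"])
```

With a real model file this would be called as `transcribe("model.tflite", pcm16_to_float(samples))`, where the path is a placeholder.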
Key highlights
- Supports both transducer (RNNT loss) and CTC model families
- Streaming Conformer variant included for low-latency scenarios
- TFLite conversion pipeline with dedicated documentation
- Pretrained model results tracked per architecture in examples/
- Docker Compose setup available; Apple Silicon requires Python ≥ 3.12
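For readers unfamiliar with the CTC family mentioned in the highlights, the sketch below shows the standard greedy decoding rule: collapse consecutive repeated labels, then drop the blank label. It is a generic illustration in plain Python, not the toolkit's own decoder; the function name `ctc_greedy_decode` and the choice of `0` as the blank index are assumptions.

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse per-frame argmax labels into an output label sequence.

    frame_ids: one label id per audio frame (the argmax of each frame).
    blank: id of the CTC blank label (assumed to be 0 here).
    """
    decoded = []
    previous = None
    for token in frame_ids:
        # Keep a label only when it starts a new run and is not the blank.
        if token != previous and token != blank:
            decoded.append(token)
        previous = token
    return decoded


# A blank between two identical labels keeps both of them.
print(ctc_greedy_decode([0, 3, 3, 0, 3, 5, 5, 0]))  # prints [3, 3, 5]
```

Transducer models decode differently, stepping a prediction network alongside the encoder, which this sketch does not cover.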
Caveats
- Training requires git clone plus manual dependency resolution (ctc_decoders, rnnt_loss from other authors); not a simple pip install
- “What’s New?” section in README is empty, suggesting the project may not be actively maintained
Verdict
Worth a look if you need TensorFlow-native ASR with a clear path to mobile/edge deployment, or if you’re working on Vietnamese speech recognition. PyTorch-first researchers will find ESPnet or NeMo more ergonomic; this is for teams already committed to the TF ecosystem.