freewym/espresso
A modular end-to-end neural speech recognition toolkit built on PyTorch and fairseq.

Velocity (7d): +0.3 ★/day · Trend: steady
Espresso is an open-source toolkit for automatic speech recognition (ASR). It provides state-of-the-art training recipes for speech datasets including WSJ, LibriSpeech, and Switchboard, and supports distributed training across GPUs and nodes. It covers several modeling and decoding approaches, including CTC decoding, Transducer models, and Conformer encoders. Features are extracted on the fly from raw waveforms, and a parallelized decoder supports word-based language model fusion.
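
To give a sense of what CTC decoding involves, here is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It is a generic illustration and not Espresso's implementation or API: the function name `ctc_greedy_decode`, the blank index of 0, and the toy scores are all assumptions made for this example.

```python
from itertools import groupby
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical convention)


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank: int = BLANK
) -> List[int]:
    """Greedy CTC decoding: best label per frame, collapse repeats, drop blanks."""
    # Pick the highest-scoring label index for each frame.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    # Merge consecutive duplicates, then remove the blank symbol.
    return [label for label, _ in groupby(best) if label != blank]


# Toy per-frame scores over a 3-symbol vocabulary (index 0 is the blank).
scores = [
    [0.1, 0.8, 0.1],    # label 1
    [0.2, 0.7, 0.1],    # label 1 (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.7, 0.2],    # label 1 (kept, because a blank separates it)
    [0.1, 0.2, 0.7],    # label 2
]
decoded = ctc_greedy_decode(scores)
print(decoded)  # [1, 1, 2]
```

Greedy decoding is the simplest CTC strategy. Beam search and language model fusion, as mentioned above, trade extra computation for better hypotheses.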