mravanelli/pytorch-kaldi

Marrying Kaldi’s plumbing to PyTorch’s neural nets

It exists to let researchers train PyTorch neural networks inside Kaldi’s proven speech recognition pipeline, rather than rebuilding the entire acoustic stack from scratch.

★ 2.4k stars · Python · Domain Apps · Language Models

What it does

PyTorch-Kaldi is a hybrid toolkit for DNN/HMM speech recognition that keeps Kaldi’s feature extraction, label computation, and decoding infrastructure while handing the neural acoustic modeling to PyTorch. You can bring your own PyTorch model or choose from pre-built options: MLP, CNN, RNN, LSTM, GRU, Li-GRU, and SincNet for raw-waveform experiments. Training supports multi-GPU setups, automatic chunking with context expansion, and resumption from the last processed chunk. It is designed to run locally or on HPC clusters.

The interesting bit

The toolkit treats the boundary between Kaldi and PyTorch as a configurable seam rather than a rigid wall. Plain config files can route multiple feature and label streams into complex neural architectures, with data shuffling and context expansion handled behind the scenes. A terse scheduler syntax such as 128*12 | 64*10 (128 for the first 12 epochs, then 64 for the next 10) lets you schedule batch size, learning rate, or dropout across epochs without touching the training script.
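
To make the scheduler syntax concrete, here is a minimal sketch of how a string like 128*12 | 64*10 could be expanded into one value per epoch. The function name `expand_schedule` and its `cast` parameter are hypothetical and written for illustration only; this is not the toolkit's own parser, and the treatment of a bare value without a repeat count is an assumption.

```python
def expand_schedule(spec, cast=float):
    """Expand 'value*epochs | value*epochs' into a list with one value per epoch.

    Hypothetical helper for illustration; not part of pytorch-kaldi.
    """
    per_epoch = []
    for segment in spec.split("|"):
        segment = segment.strip()
        if not segment:
            continue  # tolerate stray separators
        if "*" in segment:
            value, count = segment.split("*")
            per_epoch.extend([cast(value.strip())] * int(count.strip()))
        else:
            # Assumption: a bare value with no repeat count covers one epoch.
            per_epoch.append(cast(segment))
    return per_epoch


batch_sizes = expand_schedule("128*12 | 64*10", cast=int)
print(len(batch_sizes))                 # 22 epochs in total
print(batch_sizes[11], batch_sizes[12]) # 128 64 (the switch happens after epoch 12)
```

The same expansion works for floating-point settings such as a learning rate or dropout rate by leaving `cast` at its default.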

Key highlights

Pre-implemented models include MLP, CNN, RNN, LSTM, GRU, Li-GRU, and SincNet.
Supports multi-stream inputs and combinations of neural networks for complex architectures.
Automatic chunking, context expansion, and crash recovery to the last processed chunk.
Multi-GPU training and HPC cluster compatibility.
Tutorials provided for TIMIT and Librispeech datasets.
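
The context expansion named in the highlights can be pictured as frame splicing: each feature frame is concatenated with a few neighbouring frames before it reaches the network. The sketch below is plain Python with hypothetical names (`expand_context`, `left`, `right`). It assumes edge frames are padded by repeating the first or last frame, which may differ from what the toolkit actually does.

```python
def expand_context(frames, left, right):
    """Concatenate each frame with `left` past and `right` future frames.

    Hypothetical helper for illustration; not part of pytorch-kaldi.
    `frames` is a list of equal-length feature vectors (lists of floats).
    """
    if not frames:
        return []
    last = len(frames) - 1
    spliced = []
    for t in range(len(frames)):
        window = []
        for offset in range(-left, right + 1):
            # Assumption: clamp at the edges so the first/last frame is repeated.
            idx = min(max(t + offset, 0), last)
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
spliced = expand_context(frames, left=1, right=1)
print(spliced[1])  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

Each output vector has (left + 1 + right) times the original dimension, so the input layer of the acoustic model has to be sized accordingly.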

Caveats

The authors explicitly encourage users to migrate to SpeechBrain, describing it as a “much better project” and the successor for modern speech processing.
The codebase targets older PyTorch (1.0/0.4) and CUDA (8.0–9.1) versions, with Python 2.7 still listed as tested, so it feels rooted in the late 2010s.
It is fundamentally a bridge, so you still need a full Kaldi installation and familiarity with its recipes.

Verdict

Worth a look if you are maintaining or reproducing legacy DNN/HMM systems that depend on Kaldi’s decoding graph. If you are starting fresh, the authors themselves point toward SpeechBrain instead.

Frequently asked

What is mravanelli/pytorch-kaldi? A toolkit that lets researchers train PyTorch neural networks inside Kaldi’s speech recognition pipeline rather than rebuilding the entire acoustic stack from scratch.
Is pytorch-kaldi open source? Yes. mravanelli/pytorch-kaldi is an open-source project tracked on heatdrop.
What language is pytorch-kaldi written in? mravanelli/pytorch-kaldi is primarily written in Python.
How popular is pytorch-kaldi? mravanelli/pytorch-kaldi has 2.4k stars on GitHub.
Where can I find pytorch-kaldi? mravanelli/pytorch-kaldi is on GitHub at https://github.com/mravanelli/pytorch-kaldi.