Finite-state machines that learn backward, too
k2 wires classic FSA/FST algorithms into PyTorch autograd so you can train speech recognizers with CTC, MMI, and lattice rescoring in one backward pass.

What it does
k2 implements Finite State Automaton and Transducer algorithms in C++/CUDA, then exposes them to PyTorch. The target use case is speech recognition: decoding, CTC training, LF-MMI training, lattice rescoring, and confidence estimation, all differentiable and all composable in a single training graph.
The interesting bit
Instead of making every micro-operation differentiable (the PyTorch/TensorFlow way), k2 computes derivatives top-down by tracking which input arcs contributed to each output arc. That sparse, arc-level bookkeeping gets wrapped into a PyTorch Function for backward passes. The authors claim it’s more efficient and has better roundoff properties than bottom-up autograd.
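To see why arc-level bookkeeping is enough for a backward pass, here is a minimal sketch in plain Python. It is not k2's API: `forward_scores`, `backward_scores`, and the `arc_map` values are hypothetical names and data chosen for illustration. The assumption is that an algorithm emits, alongside its output arcs, a map from each output arc to the input arc it came from. Forward then reduces to a gather and backward to a scatter-add.

```python
# Illustrative sketch only: plain Python, not k2's actual API.
# arc_map[j] is the index of the input arc that output arc j was copied from,
# or -1 if output arc j has no input counterpart (an assumption of this sketch).

def forward_scores(in_scores, arc_map):
    """Gather: each output arc takes the score of its source input arc."""
    return [in_scores[i] if i >= 0 else 0.0 for i in arc_map]


def backward_scores(out_grads, arc_map, num_in_arcs):
    """Scatter-add: route each output-arc gradient back to its source arc."""
    in_grads = [0.0] * num_in_arcs
    for out_idx, in_idx in enumerate(arc_map):
        if in_idx >= 0:
            in_grads[in_idx] += out_grads[out_idx]
    return in_grads


# Hypothetical data: input arc 2 feeds two output arcs, arcs 1 and 3 feed none.
in_scores = [0.1, 0.2, 0.3, 0.4]
arc_map = [2, 0, 2, -1]

out_scores = forward_scores(in_scores, arc_map)
in_grads = backward_scores([1.0, 1.0, 1.0, 1.0], arc_map, len(in_scores))

print(out_scores)  # [0.3, 0.1, 0.3, 0.0]
print(in_grads)    # [1.0, 0.0, 2.0, 0.0]
```

The gradient for input arc 2 is 2.0 because two output arcs depend on it, while unused arcs get exactly zero. Only the arc map has to be stored between forward and backward, which is the sparsity the paragraph above refers to.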
Key highlights
- Core data structure is a templated `Ragged` tensor: think TensorFlow’s `RaggedTensor`, but arrived at independently and used very differently.
- Algorithms are written as C++11 lambdas operating directly on data pointers; CUDA kernels are instantiated from the same code via templates, and the `cub` library handles reductions like exclusive-prefix-sum.
- Heavy lifting is “embarrassingly parallelizable”; the authors say most code looks like normal C++.
- Python bindings via pybind11; PyTorch integration is done.
- Active recipes and Colab notebooks live in the separate icefall repo.
- A `v2.0-pre` branch exists for production readiness.
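For readers who have not met ragged tensors, the toy below shows the general idea and where a prefix sum comes in. It is a sketch in plain Python, not k2's `Ragged` class: the `Ragged` class here, its `row_splits` and `values` fields, and the sample data are assumptions made for illustration. Rows of different lengths are stored as one flat list plus a list of row boundaries, and those boundaries are the exclusive prefix sum of the row lengths (with the grand total appended).

```python
from itertools import accumulate


def exclusive_prefix_sum(sizes):
    """Return [0, s0, s0+s1, ...] with the grand total as the final entry."""
    return [0] + list(accumulate(sizes))


class Ragged:
    """Toy two-axis ragged array: flat values plus row boundaries.

    Hypothetical illustration only; k2's real Ragged is a C++/CUDA template.
    """

    def __init__(self, rows):
        # Row i occupies values[row_splits[i]:row_splits[i + 1]].
        self.row_splits = exclusive_prefix_sum([len(r) for r in rows])
        self.values = [v for r in rows for v in r]

    def row(self, i):
        return self.values[self.row_splits[i]:self.row_splits[i + 1]]


# Hypothetical example: IDs of the arcs leaving each of three states.
arcs_per_state = Ragged([[10, 11], [], [12, 13, 14]])

print(arcs_per_state.row_splits)  # [0, 2, 2, 5]
print(arcs_per_state.values)      # [10, 11, 12, 13, 14]
print(arcs_per_state.row(2))      # [12, 13, 14]
```

Once the boundaries are known, every element can be processed independently of its neighbors, which is why the prefix sum is one of the few steps that needs a dedicated parallel primitive.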
Caveats
- The README admits the `Ragged`-based algorithms are hard to understand without reading the code directly; the parallel structure looks nothing like CPU-native FST implementations.
- No claims of Word Error Rate improvement over existing ASR tech: the pitch is generality and extensibility, not raw accuracy.
Verdict
Worth a look if you’re building or researching speech recognition pipelines and need to backprop through decoding graphs. Probably overkill if you’re just fine-tuning a standard CTC model with off-the-shelf tools.