githubharald/CTCDecoder

Five ways to untangle neural network babble

A pedagogical Python toolkit for CTC decoding that trades speed for clarity, with a side of BK-trees and bigram language models.

CTCDecoder · Velocity (7d): +0.3 ★ / day · Trend: steady

What it does

CTCDecoder implements five algorithms for interpreting the probabilistic output of CTC-trained neural networks, the kind used in handwriting and speech recognition. You feed it a T×C matrix of softmax probabilities (T time steps, C classes including the CTC-blank); it gives back a string. The package includes best path (fast, greedy), beam search (with an optional character-level language model), lexicon search (dictionary-constrained via a BK-tree), plus prefix search and token passing for research purposes.
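To make the "matrix in, string out" idea concrete, here is a minimal sketch of greedy best-path decoding in plain Python. It is not the package's own code: the function name `best_path_sketch` and the toy matrix are made up for illustration, and the blank is assumed to occupy the last column.

```python
def best_path_sketch(mat, chars):
    """Greedy CTC decoding: argmax per time step, collapse repeats, drop blanks.

    mat   : sequence of T rows, each holding C = len(chars) + 1 probabilities
    chars : the C - 1 real characters; the blank is ASSUMED to be the last
            column (index len(chars))
    """
    blank = len(chars)
    # Most probable label index at every time step.
    labels = [max(range(len(row)), key=row.__getitem__) for row in mat]
    out = []
    prev = None
    for label in labels:
        # Emit a character only when the label changes and is not the blank.
        if label != prev and label != blank:
            out.append(chars[label])
        prev = label
    return "".join(out)


# Toy 5x3 matrix over the characters "ab" plus the blank (last column).
mat = [
    [0.8, 0.1, 0.1],  # a
    [0.7, 0.2, 0.1],  # a again (repeat, collapsed)
    [0.1, 0.1, 0.8],  # blank
    [0.6, 0.2, 0.2],  # a (a new 'a', because a blank separates it)
    [0.1, 0.8, 0.1],  # b
]
decoded = best_path_sketch(mat, "ab")
print(decoded)
```

The blank in the middle is what lets the decoder produce a doubled letter; without it, the two runs of "a" would collapse into one.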

The interesting bit

The value here is pedagogical, not performant. The author explicitly notes that no TensorFlow or PyTorch adapters exist — you manually extract batch elements, ensure the CTC-blank sits last, and convert to numpy. The lexicon search is a nice touch: it runs best path first, then uses a BK-tree to find dictionary words within an edit distance tolerance, scoring and returning the best match. There’s even an OpenCL best-path implementation hidden in extras/.
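The BK-tree lookup behind lexicon search can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the repository's class: `BKTreeSketch`, `edit_distance`, and the tiny word list are hypothetical, and the final step of scoring each candidate against the probability matrix is left out.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


class BKTreeSketch:
    """Each node is (word, children); children are keyed by edit distance."""

    def __init__(self, words):
        words = iter(words)
        self.root = (next(words), {})
        for word in words:
            self._add(word)

    def _add(self, word):
        node = self.root
        while True:
            d = edit_distance(word, node[0])
            if d == 0:
                return  # word already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def query(self, text, tolerance):
        results = []
        stack = [self.root]
        while stack:
            word, children = stack.pop()
            d = edit_distance(text, word)
            if d <= tolerance:
                results.append(word)
            # Triangle inequality: matches can only live in subtrees whose
            # edge label lies in [d - tolerance, d + tolerance].
            for edge, child in children.items():
                if d - tolerance <= edge <= d + tolerance:
                    stack.append(child)
        return results


# Pretend best path returned the non-word "bax"; fetch nearby dictionary words.
tree = BKTreeSketch(["bad", "bat", "cat", "cart", "dog"])
candidates = sorted(tree.query("bax", tolerance=1))
print(candidates)
```

The pruning step is the reason to use a BK-tree at all: most of the dictionary is never compared against the query, which matters once the word list grows beyond a toy example.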

Key highlights

  • Five decoders in one package: best path, beam search, lexicon search, prefix search, token passing
  • Optional character-level language model via bigram statistics for beam search
  • BK-tree-based lexicon search for dictionary-constrained output
  • Installable via pip install . since 2021; test suite in tests/
  • Includes author’s own comparison paper on when to use which decoder
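As a rough illustration of the bigram statistics mentioned in the highlights, the sketch below counts character pairs in a training text and turns them into conditional probabilities. The function name `train_bigram_sketch` and the add-one smoothing are assumptions made for this example; the package's own language model may be built differently.

```python
from collections import Counter, defaultdict


def train_bigram_sketch(corpus, chars):
    """Return prob(prev, nxt), an estimate of P(nxt | prev) from the corpus."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        if prev in chars and nxt in chars:
            counts[prev][nxt] += 1

    def prob(prev, nxt):
        # Add-one smoothing (an assumption of this sketch) keeps unseen
        # character pairs from getting zero probability.
        total = sum(counts[prev].values())
        return (counts[prev][nxt] + 1) / (total + len(chars))

    return prob


prob = train_bigram_sketch("abab abba", "ab ")
p_ab = prob("a", "b")  # 'b' follows 'a' in all three observed cases
p_aa = prob("a", "a")  # never observed, but nonzero thanks to smoothing
print(p_ab, p_aa)
```

A beam search decoder can weight each extension of a candidate text by such a probability, so that character sequences common in the training text are preferred over unlikely ones.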

Caveats

  • No deep learning framework integration; manual numpy conversion required
  • Batch processing is DIY: iterate elements yourself, the decoders handle single (TxC) matrices only
  • PyTorch users must move the CTC-blank from the first to the last position (PyTorch’s CTC loss uses blank index 0 by default), or configure the blank index to be the last class
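The manual preparation described in these caveats might look like the sketch below. It assumes the network output has already been converted to a NumPy array of softmax probabilities with shape (B, T, C) and the blank in column 0; the helper name `split_batch_blank_last` is hypothetical.

```python
import numpy as np


def split_batch_blank_last(batch_probs):
    """Turn a (B, T, C) batch with blank FIRST into a list of (T, C) matrices
    with blank LAST, one matrix per batch element."""
    batch_probs = np.asarray(batch_probs)
    # Rolling the class axis by -1 moves column 0 (the assumed blank) to the
    # end and shifts every real character one column to the left, in order.
    moved = np.roll(batch_probs, shift=-1, axis=-1)
    return [moved[i] for i in range(moved.shape[0])]


# Toy batch: B=2 samples, T=3 time steps, C=3 classes (blank, 'a', 'b').
batch = np.array([
    [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.2, 0.1, 0.7]],
    [[0.6, 0.3, 0.1], [0.1, 0.1, 0.8], [0.9, 0.05, 0.05]],
])
mats = split_batch_blank_last(batch)
for mat in mats:
    # Each mat is now a single (T, C) matrix ready to hand to a decoder.
    print(mat.shape, mat[0])
```

If the model emits log-probabilities or a time-major (T, B, C) layout, exponentiate or transpose before this step; those conversions are not shown here.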

Verdict

Useful if you’re learning CTC, prototyping recognition pipelines, or need a readable reference implementation to adapt. Skip it if you need production-speed decoding with native PyTorch/TensorFlow integration — this is teaching material with research extras, not a drop-in framework component.
