weinman/cnn_lstm_ctc_ocr

A tighter CRNN that beats the original on its own turf

A TensorFlow 1.x reimplementation of the classic CRNN text-recognition architecture that trims 15% of the convolutional parameters while nudging word error rate lower on the standard MJSynth synthetic dataset.

★ 502 stars · Python · Computer Vision

What it does

Trains a convolutional-recurrent neural network to read words in images end-to-end, no character segmentation required. The model feeds CNN features into a stacked bidirectional LSTM and learns with CTC loss, the standard recipe for scene-text OCR. It ships with scripts to download the MJSynth synthetic dataset, pack it into TensorFlow records, and train via a Makefile-driven pipeline.
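To make the "no character segmentation" point concrete, here is a minimal sketch of CTC best-path (greedy) decoding in plain Python. It is not code from the repository: the alphabet, the choice of index 0 for the blank symbol, and the helper `frames_from_path` are all assumptions made for illustration. The idea is that the network emits one score vector per horizontal time step, and decoding merges repeated labels and then drops blanks, so the model never has to say where one character ends and the next begins.

```python
from itertools import groupby

# Assumed label set for this sketch: index 0 is the CTC blank (shown as "-").
ALPHABET = "-abcdefghijklmnopqrstuvwxyz"
BLANK = 0


def ctc_greedy_decode(frame_scores):
    """Best-path decode: argmax per frame, merge repeats, then drop blanks."""
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    collapsed = [label for label, _ in groupby(best_path)]
    return "".join(ALPHABET[label] for label in collapsed if label != BLANK)


def frames_from_path(path):
    """Hypothetical helper: build one-hot score frames from a label string."""
    frames = []
    for ch in path:
        scores = [0.0] * len(ALPHABET)
        scores[ALPHABET.index(ch)] = 1.0
        frames.append(scores)
    return frames


# Seven frames, three characters: repeats merge, blanks vanish.
print(ctc_greedy_decode(frames_from_path("cc-a-tt")))  # cat
# A blank between identical letters keeps them distinct.
print(ctc_greedy_decode(frames_from_path("to-o")))     # too
# Without the blank the two o's collapse into one.
print(ctc_greedy_decode(frames_from_path("too")))      # to
```

The last two calls show why CTC needs a blank symbol at all: it is the only way to emit a doubled letter.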

The interesting bit

The architecture is deliberately not a revolution. It is a careful refactor of Shi et al.’s CRNN: paired 3×3 convolutions replace single layers early on, the expensive final 2×2×512 conv is dropped in favor of vertical max-pooling, and horizontal downsampling is restrained so that narrow characters are not squeezed out of the feature sequence. Batch normalization is added after every conv pair. The payoff is 15% fewer convolutional parameters and a 1.82% word error rate on case-insensitive closed-vocabulary MJSynth, edging below the original CRNN’s reported numbers.
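Two bits of arithmetic sit behind these choices, sketched below in plain Python. The numbers are assumptions for illustration, not values read from the repository: the dropped layer is taken to be a 2×2 kernel with 512 input and 512 output channels (as in the original CRNN), and the stride lists and the 32-pixel image width are made up. The first function counts the weights a convolution carries; the other two check whether a word still fits in the time steps left after horizontal downsampling, since CTC needs at least one frame per character plus one blank between identical neighbors.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Trainable parameters in a standard 2-D convolution layer."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def time_steps(image_width, horizontal_strides):
    """Feature-sequence length after the given horizontal strides.

    Uses ceiling division, i.e. assumes 'same'-style padding.
    """
    width = image_width
    for stride in horizontal_strides:
        width = -(-width // stride)
    return width


def min_ctc_frames(word):
    """Fewest frames CTC needs to emit `word`."""
    repeats = sum(1 for a, b in zip(word, word[1:]) if a == b)
    return len(word) + repeats


# Cost of the (assumed) 2x2 conv, 512 -> 512 channels, that pooling replaces.
print(conv2d_params(2, 2, 512, 512))  # 1049088

# A narrow 32-pixel-wide rendering of "letter" needs 7 frames.
needed = min_ctc_frames("letter")
for strides in ([2, 2], [2, 2, 2, 2]):
    frames = time_steps(32, strides)
    print(strides, frames, "ok" if frames >= needed else "too short")
```

With two halvings the word keeps 8 frames and remains decodable; with four it is left with 2, which no CTC alignment can turn into six letters.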

Key highlights

TensorFlow 1.x implementation using tf.data and custom Estimator APIs for I/O and training
Supports open, closed, and mixed vocabulary decoding; optional lexicon-constrained beam search via a forked CTCWordBeamSearch module
Dynamic training data generation supported through MapTextSynthesizer for domain-specific augmentation
Pre-trained checkpoints published via DOI for reproducibility of ICDAR 2019 historical-map recognition results
Validation script runs interactively: pipe image paths to validate.py and read decoded text from stdout
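The difference between open and closed vocabulary decoding can be sketched with a much simpler technique than the word beam search mentioned above: decode freely, then snap the result to the nearest lexicon word by Levenshtein distance. This is a stand-in for illustration only, not the repository's method, and the function names, the sample lexicon, and the `max_distance` parameter are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic program)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def snap_to_lexicon(raw, lexicon, max_distance=None):
    """Return the lexicon word closest to `raw`, compared case-insensitively.

    If `max_distance` is set and nothing is that close, return `raw`
    unchanged, so out-of-lexicon words pass through as open-vocabulary output.
    """
    best = min(lexicon, key=lambda w: edit_distance(raw.lower(), w.lower()))
    if max_distance is not None:
        if edit_distance(raw.lower(), best.lower()) > max_distance:
            return raw
    return best


lexicon = ["street", "stream", "avenue"]
print(snap_to_lexicon("strcet", lexicon))                  # street
print(snap_to_lexicon("qzxv", lexicon, max_distance=1))    # qzxv
```

Post-hoc snapping only sees the single best decoded string. A lexicon-constrained beam search works on the per-frame scores instead, so it can recover words that greedy decoding gets badly wrong.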

Caveats

Python 2.7 only; requires TensorFlow 1.10 or later, and newer versions emit deprecation warnings. This is legacy-stack code
Model parameters are hardcoded in src/model.py, not exposed as command-line flags
Full MJSynth download takes 4–12 hours; the included 0.1% demo set is only useful for a quick smoke test

Verdict

Worth a look if you need a well-documented, reproducible CRNN baseline in TensorFlow 1.x or want to study architectural tweaks that trade a little compute for cleaner convergence. Skip it if you are already committed to PyTorch, TensorFlow 2.x, or modern transformer-based OCR; this is a research artifact, not a maintained product.

Frequently asked

What is weinman/cnn_lstm_ctc_ocr?: A TensorFlow 1.x reimplementation of the classic CRNN text-recognition architecture that trims 15% of the convolutional parameters while nudging word error rate lower on the standard MJSynth synthetic dataset.
Is cnn_lstm_ctc_ocr open source?: Yes — weinman/cnn_lstm_ctc_ocr is open source, released under the GPL-3.0 license.
What language is cnn_lstm_ctc_ocr written in?: weinman/cnn_lstm_ctc_ocr is primarily written in Python.
How popular is cnn_lstm_ctc_ocr?: weinman/cnn_lstm_ctc_ocr has 502 stars on GitHub.
Where can I find cnn_lstm_ctc_ocr?: weinman/cnn_lstm_ctc_ocr is on GitHub at https://github.com/weinman/cnn_lstm_ctc_ocr.