TensorFlow 1.x OCR with attention, still waiting for its sequel
A packaged-up attention OCR model that trains on GCP but hasn't made the jump to TensorFlow 2.

What it does
attention-ocr is a Python package (aocr) that trains a visual attention-based OCR model: sliding CNN → LSTM → attention decoder. It handles the full pipeline from TFRecords dataset creation to training, testing with attention visualization, and exporting as SavedModel or frozen graph for TensorFlow Serving.
The interesting bit
The project packages a research model into something deployable. The CLI wraps dataset building, training, and export; the Google Cloud ML Engine integration means you can spin up GPU training jobs without writing your own plumbing. The attention visualization during testing is a nice diagnostic — you can see where the model is looking for each character.
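To make "where the model is looking" concrete, here is a minimal sketch of how an attention decoder can turn one decoder state into a weight per horizontal image position. It assumes plain dot-product scoring, which is a generic choice for illustration and not necessarily the scoring function aocr uses; `attention_weights` and the toy data are hypothetical.

```python
import math


def attention_weights(decoder_state, encoder_features):
    """Softmax over dot-product scores: one weight per image column."""
    scores = [
        sum(d * f for d, f in zip(decoder_state, feature))
        for feature in encoder_features
    ]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy input: three image columns with two features each.
features = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
state = [1.0, 1.0]

weights = attention_weights(state, features)
# Index of the column the decoder attends to most for this character.
focus = max(range(len(weights)), key=weights.__getitem__)
```

Plotting `weights` over the input image for each decoded character gives the kind of attention map the test command saves.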
Key highlights
- Installable via pip install aocr, with a CLI for the dataset, train, test, and export commands
- Exports to SavedModel (default) or frozen graph for serving
- Includes TensorFlow Serving REST API setup with base64-encoded image input
- Google Cloud ML Engine training job support documented with gcloud examples
- Attention map visualization during testing, saved to out/ by default
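For the base64-encoded image input, a request body might be built roughly like this. TensorFlow Serving's REST API takes binary values as a JSON object with a `b64` key; the input name `input` and signature name `serving_default` below are assumptions to check against the exported model's signature, and `build_predict_request` is a hypothetical helper.

```python
import base64
import json


def build_predict_request(image_bytes, input_name="input",
                          signature_name="serving_default"):
    """Build a JSON body for a TensorFlow Serving :predict call."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({
        "signature_name": signature_name,
        # Binary tensors are wrapped as {"b64": "<base64 string>"}.
        "inputs": {input_name: {"b64": encoded}},
    })


# Placeholder bytes stand in for the contents of a real image file.
body = build_predict_request(b"\x89PNG placeholder bytes")
```

The resulting string would be POSTed with a JSON content type to the model's `/v1/models/<model_name>:predict` endpoint; host, port, and model name depend on how the server was started.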
Caveats
- Stuck on TensorFlow 1.x; TF2 upgrade is “planned” but not done, and the README invites PRs
- Training “takes quite a long time to reach convergence” since CNN and attention train simultaneously
- Export requires manually moving files into a version-numbered subdirectory for TensorFlow Serving
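The manual move in the last caveat is easy to script. The sketch below assumes the export landed directly in one directory and that TensorFlow Serving should find it under a numeric version subdirectory such as `1/`; `move_into_version_dir` is a hypothetical helper, not part of aocr.

```python
import shutil
from pathlib import Path


def move_into_version_dir(export_dir, version=1):
    """Move exported files into export_dir/<version>/ for TensorFlow Serving."""
    export_path = Path(export_dir)
    version_path = export_path / str(version)
    version_path.mkdir(exist_ok=True)
    for item in list(export_path.iterdir()):
        # Leave existing numeric version directories where they are.
        if item.is_dir() and item.name.isdigit():
            continue
        shutil.move(str(item), str(version_path / item.name))
    return version_path
```

Calling `move_into_version_dir("./exported-model")` would leave the exported files under `./exported-model/1/`, and the server can then be pointed at the parent directory.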
Verdict
Worth a look if you need an attention-based OCR you can train on GCP and serve via TensorFlow Serving — but only if you’re willing to work in the TensorFlow 1.x era. Everyone else should probably wait for that TF2 migration or look elsewhere.