pannous/tensorflow-ocr

OCR that watches your cursor, slowly

A TensorFlow attention-based text recognizer with a quirky party trick: point your mouse at text, wait ten seconds for the model to wake up, then watch it read.

644 stars · Python · Computer Vision · ML Frameworks
Velocity (7d): +0.2 ★/day · Trend: steady

What it does

Trains a neural network to recognize text using an attention mechanism implemented in TensorFlow. Ships with mouse_prediction.py, which reads whatever text sits under your cursor, after a leisurely 10-second model load. Also bundles text_recognizer.py and a forked EAST detector for finding text boxes in real-world images.
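To make the cursor trick concrete, here is a minimal sketch of the first step any such tool needs: turning a mouse position into a screen region to hand to the recognizer. The README does not describe how mouse_prediction.py does this, so the function name `capture_box` and the default 200×40 pixel box are assumptions for illustration only.

```python
def capture_box(cursor_x, cursor_y, screen_w, screen_h, box_w=200, box_h=40):
    """Return (left, top, right, bottom) of a box centered on the cursor.

    Hypothetical helper: the box size is an assumed default, not a value
    taken from the repository.
    """
    # Shrink the box if the screen is smaller than the requested size.
    box_w = min(box_w, screen_w)
    box_h = min(box_h, screen_h)
    # Center on the cursor, then clamp so the box stays fully on screen.
    left = min(max(cursor_x - box_w // 2, 0), screen_w - box_w)
    top = min(max(cursor_y - box_h // 2, 0), screen_h - box_h)
    return (left, top, left + box_w, top + box_h)


if __name__ == "__main__":
    print(capture_box(500, 300, 1920, 1080))
    print(capture_box(5, 5, 1920, 1080))
```

Clamping rather than cropping keeps the region a fixed size near screen edges, which matters if the recognizer expects a fixed input shape. Whether this model does is not stated.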

The interesting bit

The training pipeline auto-generates synthetic data: it renders every font on your machine in distorted shapes, so you can bootstrap a letter-recognizer without hunting down labeled datasets. Think MNIST, but for your entire font library.
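The idea can be sketched in a few lines without any imaging library. This is an illustration under loud assumptions, not the repository's code: a hand-drawn 5×5 bitmap stands in for a glyph rendered from a font file, and the "distortion" is just a random shift plus a horizontal shear. The names `GLYPH_T`, `distort`, and `make_dataset` are hypothetical.

```python
import random

# Assumption: a hand-drawn bitmap replaces real font rendering.
GLYPH_T = [
    "#####",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
]


def distort(glyph, rng, canvas=9, max_shift=2, max_shear=1):
    """Draw glyph on a canvas x canvas grid with a random shift and shear."""
    h, w = len(glyph), len(glyph[0])
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    shear = rng.randint(-max_shear, max_shear)
    out = [[0] * canvas for _ in range(canvas)]
    for r, row in enumerate(glyph):
        for c, ch in enumerate(row):
            if ch != "#":
                continue
            y = (canvas - h) // 2 + r + dy
            # Shear moves each row sideways in proportion to its
            # distance from the glyph's middle row.
            x = (canvas - w) // 2 + c + dx + shear * (r - h // 2)
            # Pixels pushed off the canvas are simply dropped.
            if 0 <= y < canvas and 0 <= x < canvas:
                out[y][x] = 1
    return out


def make_dataset(glyphs, n_per_label, seed=0):
    """Return a list of (image, label) pairs, n_per_label per glyph."""
    rng = random.Random(seed)
    return [
        (distort(bitmap, rng), label)
        for label, bitmap in glyphs.items()
        for _ in range(n_per_label)
    ]


if __name__ == "__main__":
    image, label = make_dataset({"T": GLYPH_T}, 1)[0]
    print(label)
    for row in image:
        print("".join("#" if px else "." for px in row))
```

The labels come for free because the generator knows which glyph it drew, which is why this approach avoids the need for a hand-labeled dataset.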

Key highlights

  • Attention-based sequence recognition built in raw TensorFlow
  • train_letters.py generates synthetic training data from local fonts automatically
  • mouse_prediction.py does live screen OCR via mouse position
  • Forks EAST for text detection in natural images
  • “Batteries included” — though the README doesn’t specify which batteries
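For readers who have not met attention before, here is a generic sketch of one attention step in plain Python. The README does not say which attention variant the repository uses, so this shows scaled dot-product attention purely as an illustration; it is not the project's TensorFlow code, and `softmax` and `attend` are hypothetical names.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns (weights, context): one weight per key, and the
    weighted average of the value vectors.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    values = [[10.0], [20.0], [30.0]]
    weights, context = attend([4.0, 0.0], keys, values)
    print(weights, context)
```

In a sequence recognizer, the query would typically come from the decoder state and the keys and values from image features, so each output character can focus on a different part of the input.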

Caveats

  • 10-second cold-start for inference; no mention of GPU requirements or model size
  • No accuracy numbers, benchmark comparisons, or dataset details provided
  • README is sparse: no example outputs, no architecture diagram, no training time estimates

Verdict

Worth a look if you’re building an OCR pipeline from scratch and want to see attention mechanics without framework abstraction. Skip it if you need production latency or documented accuracy — this is clearly research-grade tooling with rough edges.
