School project that grew up to read your bad handwriting
A Czech student's Intel ISEF entry became a full pipeline for converting photos of handwritten pages into text, with a notable tolerance for messy reality.

What it does
Takes a photo of a handwritten page and runs it through four stages: find the page and strip the background, locate and cut out individual words, normalize those word images, then split and recognize characters. The whole flow lives in Jupyter notebooks, with TensorFlow and OpenCV doing the heavy lifting. It handles Czech text specifically, not just English.
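The "locate and cut out individual words" stage can be pictured with a classical projection-profile split: count ink per column, then cut wherever a wide enough run of blank columns appears. The following is a minimal sketch of that general idea, not necessarily how the notebooks do it. It assumes an already binarized image given as nested lists (1 = ink), and `column_profile`, `split_on_gaps`, and `min_gap` are hypothetical names chosen for illustration.

```python
def column_profile(image):
    """Count ink pixels in each column of a binary image (1 = ink)."""
    return [sum(column) for column in zip(*image)]


def split_on_gaps(profile, min_gap=2):
    """Return (start, end) column spans of ink, with end exclusive.

    Two ink columns belong to different spans when at least
    `min_gap` blank columns lie between them.
    """
    spans = []
    start = None      # first ink column of the span being built
    last_ink = None   # most recent ink column seen
    for x, count in enumerate(profile):
        if count == 0:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            spans.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        spans.append((start, last_ink + 1))
    return spans


# Toy "page": two words separated by three blank columns.
rows = [
    "##.#...##",
    "#..#...#.",
]
image = [[1 if ch == "#" else 0 for ch in row] for row in rows]

spans = split_on_gaps(column_profile(image), min_gap=2)
words = [[row[a:b] for row in image] for a, b in spans]
print(spans)
```

The single blank column inside the first word is narrower than `min_gap`, so it does not trigger a cut. On real handwriting that threshold would need tuning, since letter spacing and word spacing overlap far more than in this toy input.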
The interesting bit
The project treats OCR as a staged computer-vision pipeline rather than throwing an end-to-end neural network at the whole page. That modular approach means you can inspect (and presumably debug) what went wrong at each step, which is useful when your training data is your own school notes.
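The appeal of a staged design is easiest to see in code: if every stage is a plain function, a small runner can keep each intermediate result around for inspection. This is a hedged sketch of that pattern only. The stage names follow the four steps described here, but `run_pipeline` and the string-based stand-in stages are hypothetical and do no image processing.

```python
def run_pipeline(data, stages):
    """Run stages in order and keep every intermediate result."""
    trace = [("input", data)]
    for name, stage in stages:
        data = stage(data)
        trace.append((name, data))
    return data, trace


# Hypothetical stand-ins: each works on plain strings so the sketch
# runs without TensorFlow or OpenCV.
stages = [
    ("page_detection", lambda page: page.strip("#")),          # drop the "background"
    ("word_segmentation", lambda text: text.split()),           # cut into words
    ("normalization", lambda words: [w.lower() for w in words]),
    ("character_recognition", lambda words: [list(w) for w in words]),
]

result, trace = run_pipeline("##Ahoj  Svete##", stages)
for name, value in trace:
    print(f"{name:>22}: {value!r}")
```

When the final output is wrong, walking `trace` from the top shows the first stage whose output looks off, which is the debugging benefit an end-to-end model does not offer as directly.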
Key highlights
- Four-step pipeline: page detection → word segmentation → normalization → character recognition
- Built for Czech language support, not just English
- Started as a school project, presented at Intel ISEF 2018
- Main entry points: OCR.ipynb and OCR-Evaluator.ipynb
- Requires manual download of datasets and models after cloning
- TensorFlow 1.4 and Python 3.6 (yes, that era)
Caveats
- Dependencies are pinned to 2017-era versions (TensorFlow 1.4, NumPy 1.13, OpenCV 3.1); expect friction getting this running on modern systems
- README is vague on model accuracy, dataset size, and whether the character recognition step uses a CNN, RNN, or something else—check the notebooks
- No automated tests or CI; this is research/experiment code
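To gauge how far a current environment has drifted from the pinned versions listed above, a quick check along these lines may help. It is a sketch under assumptions: the PyPI distribution names (in particular `opencv-python`) are guesses rather than something the README confirms, `check_pins` and `installed_version` are hypothetical helpers, and `importlib.metadata` needs Python 3.8 or newer, so this is meant for the modern machine, not the original 3.6 setup.

```python
from importlib import metadata  # standard library from Python 3.8

# Major.minor pins as listed in the caveats; distribution names are assumptions.
PINS = {"tensorflow": "1.4", "numpy": "1.13", "opencv-python": "3.1"}


def installed_version(dist):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins=PINS, lookup=installed_version):
    """Map each distribution to a (wanted, found, matches) tuple."""
    report = {}
    for dist, wanted in pins.items():
        found = lookup(dist)
        # Compare only major.minor, since the pins carry no patch level.
        matches = found is not None and found.split(".")[:2] == wanted.split(".")
        report[dist] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    for dist, (wanted, found, ok) in check_pins().items():
        status = "ok" if ok else "mismatch"
        shown = found or "not installed"
        print(f"{dist}: want {wanted}.x, found {shown} -> {status}")
```

A wall of mismatches is a reasonable signal to reach for an isolated environment with an old Python rather than trying to upgrade the notebooks in place.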
Verdict
Worth a look if you’re building a custom OCR pipeline and want to see how classical CV segmentation pairs with ML recognition, or if you need Czech handwriting support. Skip it if you want a production-ready API or modern TensorFlow 2.x code.