OCR from scratch: a neural net that learns to read, in Node.js
A vanilla JavaScript MLP that trains itself on generated captchas or MNIST digits, then exports a standalone predictor you can require like any module.

What it does
This repo trains a plain multi-layer perceptron to recognize characters from images, then serializes the trained network to ./ocr.js — a self-contained module you require() and feed binary pixel arrays. It works with either the classic MNIST handwritten digits or synthetic training data generated from a hacked-up captcha engine.
The interesting bit
The training pipeline is the clever part: it renders glyphs to a canvas, thresholds the pixels into a flat binary array, and trains the MLP without touching TensorFlow or PyTorch. The README includes hard numbers (95.16% on MNIST in ~22 minutes on a 2015 MacBook Pro), which give you an honest baseline for how far a simple neural net can get.
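To make the thresholding step concrete, here is a minimal sketch of turning RGBA pixel data into a flat binary array. The helper name toBinaryPixels, the default cutoff of 128, and the "dark pixel means 1" polarity are all assumptions for illustration; the repo may do this differently.

```javascript
// Hypothetical helper (not taken from the repo): converts RGBA pixel data,
// laid out like the data returned by a canvas getImageData call, into a
// flat array of 0/1 values.
function toBinaryPixels(rgba, threshold = 128) {
  const out = [];
  for (let i = 0; i < rgba.length; i += 4) {
    // Average the three colour channels; alpha (index i + 3) is ignored.
    const gray = (rgba[i] + rgba[i + 1] + rgba[i + 2]) / 3;
    // Assumed polarity: dark pixels (ink) become 1, light pixels become 0.
    out.push(gray < threshold ? 1 : 0);
  }
  return out;
}

// A 2x2 image: black, white, dark gray, light gray.
const rgba = Uint8ClampedArray.from([
  0, 0, 0, 255,
  255, 255, 255, 255,
  60, 60, 60, 255,
  200, 200, 200, 255,
]);

console.log(toBinaryPixels(rgba)); // [ 1, 0, 1, 0 ]
```

The output has one entry per pixel, so a width × height image becomes an array of width × height zeros and ones, which is the shape a plain MLP input layer expects.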
Key highlights
- Exports a standalone ocr.js predictor after training; no runtime dependencies on the training stack
- Supports MNIST or auto-generated synthetic datasets from configurable fonts and character sets
- Flattened 1D binary input keeps the implementation simple (no convolutions, no GPU)
- Documented benchmarks for digits, lowercase, and mixed alphanumeric sets
- Configurable via a single config.json: hidden layer size, learning rate, image dimensions, threshold
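For a sense of what "no convolutions, no GPU" means in practice, the sketch below runs a forward pass through a tiny two-layer perceptron on a flat binary input. The function names (layer, predict), the sigmoid activation, and the hand-picked weights are all illustrative assumptions; a real network would get its weights from training, and the repo's internals may differ.

```javascript
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// One fully connected layer:
// out[j] = sigmoid(biases[j] + sum over i of weights[j][i] * input[i])
function layer(input, weights, biases) {
  return weights.map((row, j) =>
    sigmoid(row.reduce((sum, w, i) => sum + w * input[i], biases[j]))
  );
}

// Forward pass: input -> hidden layer -> output layer -> index of the
// largest output activation (the predicted class).
function predict(pixels, net) {
  const hidden = layer(pixels, net.w1, net.b1);
  const output = layer(hidden, net.w2, net.b2);
  return output.indexOf(Math.max(...output));
}

// Hand-picked (untrained) weights for a 4-pixel input, 2 hidden units,
// and 2 output classes. The number of rows in w1 is the hidden layer size.
const net = {
  w1: [[1, 1, 0, 0], [0, 0, 1, 1]],
  b1: [0, 0],
  w2: [[1, -1], [-1, 1]],
  b2: [0, 0],
};

console.log(predict([1, 1, 0, 0], net)); // 0
console.log(predict([0, 0, 1, 1], net)); // 1
```

Everything here is array arithmetic over a 1D input, which is why this kind of model can be serialized into a single dependency-free module.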
Caveats
- Requires Cairo system dependencies for the canvas package, which can be a pain to install
- The alphanumeric benchmark omits the digit 4 from its training text (likely a typo: 012356789)
- No mention of model versioning, incremental training, or handling real-world noise beyond the synthetic generator
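A gap like the missing 4 is easy to catch before training. The check below is a small sketch with a hypothetical helper, missingChars, that is not part of the repo; it compares a training text against the character set you expect it to cover.

```javascript
// Hypothetical helper: returns the characters of `expected` that never
// appear in `trainingText`.
function missingChars(trainingText, expected) {
  const seen = new Set(trainingText);
  return [...expected].filter((ch) => !seen.has(ch));
}

console.log(missingChars('012356789', '0123456789')); // [ '4' ]
```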
Verdict
Worth a look if you want to understand MLPs without framework magic, or need a tiny, embeddable OCR predictor. Skip it if you need production-grade accuracy on messy scans — this is a teaching tool and weekend-project baseline, not a replacement for Tesseract or modern CNNs.