naptha/tesseract.js

OCR in the browser without calling a single API

Tesseract.js compiles the venerable Tesseract engine to WebAssembly so you can extract text from images entirely client-side.

38.1k stars · JavaScript · Computer Vision · +9.5 ★/day over the last 7 days · trend: steady

What it does Tesseract.js wraps a WebAssembly build of the Tesseract OCR engine, exposing it through a simple JavaScript worker API. Feed it an image URL, Blob, or buffer; get back recognized text in one of 100+ languages. It runs in browsers via CDN, ESM, or webpack, and on Node.js servers without touching native dependencies.
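A minimal sketch of that worker flow, assuming tesseract.js v5 or later is installed from npm. The helper name `recognizeText` and its `workerFactory` parameter are hypothetical, added here only so the example can be exercised with a stub; `createWorker`, `recognize`, and `terminate` are the library calls.

```javascript
// Sketch only: assumes `npm install tesseract.js` (v5+ call signature).
// `recognizeText` and `workerFactory` are illustrative names, not library API.
async function recognizeText(image, lang = 'eng', workerFactory) {
  // Load the library lazily so a stub factory can be swapped in for tests.
  const createWorker = workerFactory || require('tesseract.js').createWorker;

  // From v5 on, the language is passed straight to createWorker.
  const worker = await createWorker(lang);
  try {
    // `image` can be a URL, a Blob (browser), or a Buffer (Node.js).
    const { data } = await worker.recognize(image);
    return data.text;
  } finally {
    // Always release the WebAssembly worker, even if recognition throws.
    await worker.terminate();
  }
}
```

In an application this would be called as `await recognizeText('receipt.png')`. For many images, creating one worker and reusing it is usually cheaper than creating one per call.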

The interesting bit The project is deliberately a thin wrapper, not a fork. It does not modify Tesseract’s recognition model, add PDF support, or chase accuracy tweaks. That restraint keeps it maintainable, and the README openly points users to Scribe.js when they need PDF parsing or model improvements, an unusually honest bit of scope discipline.

Key highlights

  • Ships language packs on demand; v5 cut English downloads by 54% and Chinese by 73%
  • Supports real-time video recognition via worker threads
  • v6 fixed a long-standing memory leak and reduced runtime memory across the board
  • Output formats beyond plain text (like hocr, blocks) are now opt-in, not default
  • Requires Node.js 16+ for v7; API has shifted across major versions, so check migration notes
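The opt-in output formats mentioned above can be sketched as follows. This assumes the v6 behavior where the third argument to `worker.recognize` is an object selecting which outputs to compute. The helper name `recognizeWithHocr` is hypothetical, and the worker is passed in so that it can be created once and reused.

```javascript
// Sketch only: assumes a worker created with tesseract.js v6+,
// where non-text outputs such as hOCR must be requested explicitly.
// `recognizeWithHocr` is an illustrative helper, not part of the library.
async function recognizeWithHocr(worker, image) {
  const { data } = await worker.recognize(
    image,
    {},                          // per-call recognition options (none here)
    { text: true, hocr: true },  // outputs to compute; hocr is off by default
  );
  return { text: data.text, hocr: data.hocr };
}
```

Code written against v5 or earlier that reads `data.hocr` or `data.blocks` without requesting them would likely get an empty value after upgrading, so this is one of the first places to check during a migration.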

Caveats

  • No PDF support; the README is explicit that this is out of scope
  • Breaking changes are common across major versions: createWorker went async in v4, argument signatures changed in v5, and non-text outputs were disabled by default in v6
  • Accuracy is whatever upstream Tesseract provides; do not expect model improvements here

Verdict Use this when you need OCR without infrastructure: a static site, an Electron app, a serverless function. Skip it if you need PDF text extraction, production-grade accuracy tuning, or a stable API across upgrades.
