← all repositories
datalab-to/chandra

An OCR model that actually reads your handwriting and tables

Chandra OCR 2 turns scanned chaos into structured Markdown, HTML, or JSON without destroying the layout.

11.1k stars Python Computer Vision
chandra
Velocity · 7d
+46
★ / day
Trend
steady
star history

What it does

Chandra OCR 2 ingests images and PDFs, then spits out structured text in Markdown, HTML, or JSON while preserving layout details like tables, forms, math notation, and even checkbox states. It supports 90+ languages and offers two inference paths: a local HuggingFace backend or a remote vLLM server via Docker.

The interesting bit

The model was benchmarked against olmOCR and topped it, plus the authors built their own multilingual benchmark because, as they note, “there isn’t a good public multilingual OCR benchmark.” That’s either admirable thoroughness or a telling gap in the field—probably both.

Key highlights

  • Outputs Markdown, HTML, and JSON with layout metadata in one pass
  • Handles handwriting, math, tables, forms, and image extraction
  • 90+ language support with published benchmarks for 43 common ones
  • CLI tool plus optional Streamlit app for interactive single-page processing
  • vLLM server mode for batch production workloads

Caveats

  • The open weights use a modified OpenRAIL-M license: free for research, personal use, and startups under $2M revenue, but competitive use against the Datalab API is prohibited
  • The managed Datalab API claims higher accuracy than the open weights, so self-hosters aren’t getting the best version
  • Benchmarks include “own benchmarks” alongside external ones; the olmOCR table shows Chandra 2 leading, but olmOCR still wins on headers/footers

Verdict

Worth a look if you’re building document pipelines and need layout-aware extraction without calling a cloud API. Skip it if your use case is commercial at scale and you don’t want to negotiate a license—or if you just need plain text and don’t care about table structure.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.