axa-group/Parsr

A document parsing toolchain that extracts structured data from PDFs, images, and documents using OCR and NLP techniques.

6.2k stars · JavaScript · Data Tooling
Velocity (7d): +2.5 ★/day · Trend: steady

Parsr is a document cleaning, parsing, and extraction toolchain that converts PDFs, images, and documents into structured formats including JSON, Markdown, CSV, and TXT. It performs hierarchy regeneration and detects document elements such as headings, tables, lists, and page numbers. The project was created by AXA Group but is no longer actively maintained.
