
vorojar/Folio-OCR

Batch OCR workbench powered by local GLM-OCR and Ollama models for document digitization.

441 stars · Python · Computer Vision · Data Tooling
Velocity (7d): +3.6 ★/day · Trend: steady

Folio-OCR is a three-panel document OCR application that uses local GLM-OCR and Ollama models to recognize text from PDFs and images in batch. It runs layout detection to segment each page into text regions automatically, merges adjacent regions to reduce API calls, and exports results as Markdown, Word, EPUB, or plain text. The tool runs entirely offline and uses SQLite for session persistence.
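The region-merging step can be illustrated with a minimal sketch. This is not Folio-OCR's actual code: the `Box` type, the `merge_adjacent` function, and the `max_gap` threshold are hypothetical names chosen for illustration, and the greedy top-to-bottom strategy is only one plausible way to combine neighbouring layout regions so that fewer crops are sent to the OCR model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned text region in page pixel coordinates (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int


def vertical_gap(a: Box, b: Box) -> int:
    """Empty space in pixels between two boxes along the y axis (0 if they overlap)."""
    return max(0, max(a.y0, b.y0) - min(a.y1, b.y1))


def horizontal_overlap(a: Box, b: Box) -> int:
    """Width in pixels that the two boxes share along the x axis."""
    return max(0, min(a.x1, b.x1) - max(a.x0, b.x0))


def merge_adjacent(boxes: list[Box], max_gap: int = 12) -> list[Box]:
    """Greedily merge regions, scanning top to bottom.

    A region is folded into the previously emitted one when the two are
    at most `max_gap` pixels apart vertically and share some horizontal
    extent. The threshold is an assumed value, not one taken from Folio-OCR.
    """
    merged: list[Box] = []
    for box in sorted(boxes, key=lambda b: (b.y0, b.x0)):
        if merged:
            last = merged[-1]
            if vertical_gap(last, box) <= max_gap and horizontal_overlap(last, box) > 0:
                # Replace the last region with the bounding box of both.
                merged[-1] = Box(
                    min(last.x0, box.x0), min(last.y0, box.y0),
                    max(last.x1, box.x1), max(last.y1, box.y1),
                )
                continue
        merged.append(box)
    return merged


regions = [
    Box(50, 40, 550, 80),     # heading
    Box(50, 86, 550, 200),    # paragraph 6 px below the heading
    Box(50, 400, 550, 520),   # paragraph far below, stays separate
]
print(merge_adjacent(regions))
```

In this example the heading and the first paragraph collapse into one region, so three detected regions become two OCR requests. A larger `max_gap` trades fewer calls for bigger crops per call.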
