imanoop7/Ollama-OCR
OCR tool powered by Ollama-hosted vision-language models for extracting text from images and PDFs.

Velocity (7d): +4.1 ★/day · Trend: steady
This repository provides an OCR package that uses vision-language models served through Ollama to extract text from images and PDFs. It supports several vision models, including LLaVA, Llama 3.2 Vision, Granite3.2-vision, Moondream, and MiniCPM-v. Output formats include Markdown, plain text, JSON, structured data, key-value pairs, and tables. It is available both as a Python library and as a Streamlit web application.
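To illustrate the general idea of OCR through an Ollama-hosted vision model, here is a minimal sketch that talks to Ollama's REST endpoint (`/api/generate`) directly. It is not the package's own API. The helper names `build_ocr_request` and `run_ocr`, the prompt wording in `PROMPTS`, and the default model tag `llava` are all assumptions made for illustration, and the sketch assumes an Ollama server running locally on its default port with the model already pulled.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint for single-turn generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical prompt templates, one per output format.
# These are illustrative only and are not the prompts used by the package.
PROMPTS = {
    "markdown": "Extract all text from this image and format it as Markdown.",
    "text": "Extract all text from this image as plain text.",
    "json": "Extract all text from this image and return it as a JSON object.",
}


def build_ocr_request(image_bytes: bytes, model: str = "llava",
                      output_format: str = "markdown") -> dict:
    """Build the JSON payload for an Ollama generate call with one image."""
    if output_format not in PROMPTS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    return {
        "model": model,
        "prompt": PROMPTS[output_format],
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for a single JSON response instead of a stream of chunks.
        "stream": False,
    }


def run_ocr(image_path: str, model: str = "llava",
            output_format: str = "markdown", url: str = OLLAMA_URL) -> str:
    """Send one image to a running Ollama server and return the model's text."""
    with open(image_path, "rb") as f:
        payload = build_ocr_request(f.read(), model, output_format)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

A call such as `run_ocr("receipt.png", model="llava", output_format="text")` would return whatever text the model produces for that image; the file name here is a placeholder. Keeping payload construction separate from the network call makes the request shape easy to inspect without a running server.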