Is omniparse open source?

Yes — adithya-s-k/omniparse is open source, released under the GPL-3.0 license.

What language is omniparse written in?

adithya-s-k/omniparse is primarily written in Python.

How popular is omniparse?

adithya-s-k/omniparse has 7.6k stars on GitHub.

Where can I find omniparse?

adithya-s-k/omniparse is on GitHub at https://github.com/adithya-s-k/omniparse.

adithya-s-k/omniparse

A local Swiss Army knife for turning files into LLM-ready markdown

OmniParse bundles OCR, transcription, and web crawling into one self-hosted pipeline so your RAG pipeline doesn't need a dozen SaaS subscriptions.

★ 7.6k stars, Python, Data Tooling, RAG · Search

What it does

OmniParse is a self-hosted ingestion server that chews through documents, images, audio, video, and web pages, then spits out structured markdown. It wraps existing open-source models (Marker/Surya for PDFs, Florence-2 for image tasks, Whisper for audio, and Selenium for crawling) behind a single HTTP API. The pitch is simple: feed it a file, get back something clean enough to drop straight into a vector database.
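
As a rough illustration of the "feed it a file, get back markdown" flow, the sketch below picks a parser route from the file extension and posts the file as multipart form data, using only the Python standard library. The route paths, the `file` form field, the extension table, the `http://localhost:8000` address, and the JSON response are all assumptions made for illustration; none of them is confirmed on this page, so check the project README before relying on any of it.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Assumed routes; verify against the OmniParse README before relying on them.
ROUTES = {
    ".pdf": "/parse_document",
    ".docx": "/parse_document",
    ".pptx": "/parse_document",
    ".png": "/parse_media/image",
    ".jpg": "/parse_media/image",
    ".mp3": "/parse_media/audio",
    ".wav": "/parse_media/audio",
    ".mp4": "/parse_media/video",
}


def endpoint_for(filename: str) -> str:
    """Pick a parser route from the file extension (hypothetical mapping)."""
    suffix = Path(filename).suffix.lower()
    try:
        return ROUTES[suffix]
    except KeyError:
        raise ValueError(f"no parser route known for {suffix!r}") from None


def encode_multipart(filename: str, data: bytes, field: str = "file"):
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def parse_file(path: str, base_url: str = "http://localhost:8000") -> dict:
    """POST a local file to the assumed route and decode an assumed JSON reply."""
    p = Path(path)
    body, content_type = encode_multipart(p.name, p.read_bytes())
    request = urllib.request.Request(
        base_url + endpoint_for(p.name),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `parse_file("report.pdf")` would only work against a running server that actually exposes these routes; the routing and encoding helpers can be exercised on their own without one.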

The interesting bit

The project squeezes all of this onto a single T4 GPU (about 8–10 GB VRAM) by deliberately using the smallest model variants. That’s a pragmatic trade-off: it sacrifices peak accuracy for the ability to run entirely offline without API keys or egress costs. The roadmap is even more ambitious: eventually replacing the whole model zoo with one multimodal parser.

Key highlights

Supports ~20 file types across documents, media, and dynamic web pages
Runs completely local; no external API calls
Docker and SkyPilot deployment options, plus a Gradio UI
Modular server startup: load only the document, media, or web parsers you need
Outputs structured markdown with table extraction, image captioning, and transcription
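
To make the modular startup point concrete, here is a small sketch that builds a server command containing only the parser groups you ask for. The script name `server.py` and the flags `--host`, `--port`, `--documents`, `--media`, and `--web` are assumptions not stated on this page, and `server_command` is a hypothetical helper, so confirm the real options with the project's README or `--help` output first.

```python
import shlex

# Assumed flag names; confirm them with `python server.py --help`.
PARSER_FLAGS = {"documents": "--documents", "media": "--media", "web": "--web"}


def server_command(parsers, host="0.0.0.0", port=8000):
    """Build an argv list that starts the server with only the chosen parsers."""
    unknown = sorted(set(parsers) - set(PARSER_FLAGS))
    if unknown:
        raise ValueError(f"unknown parser group(s): {', '.join(unknown)}")
    argv = ["python", "server.py", "--host", host, "--port", str(port)]
    # Keep a stable flag order regardless of how the caller ordered `parsers`.
    argv += [flag for name, flag in PARSER_FLAGS.items() if name in parsers]
    return argv


print(shlex.join(server_command(["web", "documents"])))
# prints: python server.py --host 0.0.0.0 --port 8000 --documents --web
```

Leaving out a group (here, media) is how you would skip loading the models it needs, which matters on a GPU with limited VRAM.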

Caveats

Server is Linux-only; Windows and macOS are explicitly unsupported
Underlying Marker/Surya model weights are licensed under cc-by-nc-sa-4.0, which restricts commercial use (waived only for small orgs under $5M in revenue and funding)
Document parsing has known rough edges: equations don’t always convert to LaTeX, tables can misalign, and non-English text (e.g., Chinese) may be parsed poorly
Smallest model variants mean “best-in-class performance” is explicitly not the goal

Verdict

Worth a look if you’re building RAG pipelines and tired of stitching together five different services, but only if you’ve got the GPU and the Linux box to host it. Teams needing production-grade OCR accuracy or Windows deployment should probably wait or look elsewhere.
