landing-ai/agentic-doc

This document parser is officially a ghost

LandingAI's Python wrapper for Agentic Document Extraction has been deprecated in favor of a new library, but the repo still holds 2,400 stars and a useful pattern for API client design.

★ 2.4k stars · Python · Data Tooling · Domain Apps

What it does

agentic-doc is a Python client for LandingAI’s Agentic Document Extraction API. It turns visually complex documents—PDFs, images, URLs—into structured JSON and Markdown, handling the messy parts like splitting 1,000-page PDFs into parallel chunks, retrying on rate limits, and stitching results back together.
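
The split-and-stitch step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the library's own code: `split_pages`, `parse_chunk`, and `parse_large_document` are hypothetical names, the 100-page limit is a placeholder, and `parse_chunk` stands in for a real API call.

```python
from concurrent.futures import ThreadPoolExecutor


def split_pages(total_pages, max_pages_per_chunk):
    """Return (start, end) page ranges, end exclusive, covering the whole document."""
    return [
        (start, min(start + max_pages_per_chunk, total_pages))
        for start in range(0, total_pages, max_pages_per_chunk)
    ]


def parse_chunk(page_range):
    """Hypothetical stand-in for one API call; returns fake per-page results."""
    start, end = page_range
    return [f"page {n}" for n in range(start, end)]


def parse_large_document(total_pages, max_pages_per_chunk=100, max_workers=4):
    """Parse chunks in parallel, then stitch the results back in page order."""
    ranges = split_pages(total_pages, max_pages_per_chunk)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so stitching is a plain concatenation.
        parts = list(pool.map(parse_chunk, ranges))
    return [page for part in parts for page in part]
```

Because `ThreadPoolExecutor.map` preserves input order, the chunks can finish in any order without scrambling the final document.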

The interesting bit

The library treats “just call the API” as harder than it sounds. It auto-splits large documents against page limits, manages thread pools and exponential backoff for 408/429/502-504 errors, and even generates bounding-box visualizations so you can verify the model actually looked where it claims. That’s the kind of boring reliability that separates a demo from production code.
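
A retry loop of the kind described here might look like the sketch below. It assumes a hypothetical `send` callable that returns a `(status_code, body)` pair. The delay values and the added jitter are illustrative choices, not settings taken from agentic-doc.

```python
import random
import time

# Status codes named in the text above: 408, 429, and 502-504.
RETRYABLE_STATUS = {408, 429, 502, 503, 504}


def call_with_backoff(send, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-retryable status or retries run out.

    `send` is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE_STATUS or attempt == max_retries:
            return status, body
        # Double the wait on each attempt, cap it, and add up to 10% jitter
        # so parallel workers do not all retry at the same instant.
        delay = min(base_delay * 2 ** attempt, max_delay)
        sleep(delay + random.uniform(0, delay * 0.1))
```

Passing `sleep` as a parameter keeps the function testable without real waiting. In production the default `time.sleep` applies.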

Key highlights

Single parse() function handles files, URLs, raw bytes, or connector configs (S3, Google Drive, local directories)
Pydantic models for typed field extraction with per-field confidence scores
Configurable parallelism and retries via environment variables or .env files—no code changes needed
Visual debugging tools: save grounding snippets as PNGs or generate full annotated page images
Officially legacy, though the repo still carries CI badges
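
The environment-driven configuration in the list above follows a common pattern, sketched here. The variable names `EXAMPLE_MAX_WORKERS` and `EXAMPLE_MAX_RETRIES`, the `ClientSettings` class, and the defaults are all hypothetical. The README documents the names agentic-doc actually reads.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientSettings:
    """Illustrative settings object; field names and defaults are assumptions."""
    max_workers: int = 4
    max_retries: int = 5


def load_settings(env=None):
    """Build settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    return ClientSettings(
        max_workers=int(env.get("EXAMPLE_MAX_WORKERS", ClientSettings.max_workers)),
        max_retries=int(env.get("EXAMPLE_MAX_RETRIES", ClientSettings.max_retries)),
    )
```

Tuning parallelism then means exporting a variable or editing a .env file, with no change to the calling code.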

Caveats

Deprecated: README opens with a deprecation warning pointing to landingai-ade for new projects
Requires LandingAI API key; not a self-hosted or offline solution
Python 3.9–3.12 only

Verdict

Worth studying if you’re building a similar API client wrapper: it’s a solid reference for handling large-document splitting, retries, and batch parallelism. Don’t start new projects here; use landingai-ade instead. If you need offline document parsing, this was never the tool for you.

Frequently asked

What is landing-ai/agentic-doc? LandingAI's Python wrapper for Agentic Document Extraction has been deprecated in favor of a new library, but the repo still holds 2,400 stars and a useful pattern for API client design.
Is agentic-doc open source? Yes, landing-ai/agentic-doc is open source, released under the Apache-2.0 license.
What language is agentic-doc written in? landing-ai/agentic-doc is primarily written in Python.
How popular is agentic-doc? landing-ai/agentic-doc has 2.4k stars on GitHub.
Where can I find agentic-doc? landing-ai/agentic-doc is on GitHub at https://github.com/landing-ai/agentic-doc.