
CosmosShadow/gptpdf

A Python tool that leverages GPT-4o's visual capabilities to convert PDF documents into markdown format with support for complex layout elements.

3.6k stars · Python · Data Tooling
Velocity (7d): +5.0 ★ / day · Trend: steady

The project parses a PDF in two steps. It first uses PyMuPDF to locate the non-text areas of each page, then passes those regions to a large vision model (GPT-4o), which extracts the content and converts it to markdown. It handles typography, mathematical formulas, tables, images, and charts at a reportedly low cost (about $0.013 per page). The tool is implemented in about 293 lines of code and depends on the GeneralAgent library for OpenAI API interaction.
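The first step and the cost figure can be illustrated with a small sketch. This is not gptpdf's own code: the helper names `non_text_regions` and `estimated_cost` are hypothetical, the sample data only mimics the block dictionaries that PyMuPDF returns from `page.get_text("dict")`, and vector drawings (which PyMuPDF reports separately) are ignored here.

```python
from typing import Iterable

Rect = tuple[float, float, float, float]  # (x0, y0, x1, y1) in page points


def non_text_regions(blocks: Iterable[dict]) -> list[Rect]:
    """Return bounding boxes of blocks that are not text.

    Assumes PyMuPDF's "dict" layout, where each block carries a "type"
    (0 = text, 1 = image) and a "bbox". Hypothetical helper for illustration.
    """
    return [tuple(b["bbox"]) for b in blocks if b.get("type") != 0]


def estimated_cost(pages: int, usd_per_page: float = 0.013) -> float:
    """Rough cost estimate using the reported ~$0.013 per page."""
    return round(pages * usd_per_page, 2)


# Sample data shaped like page.get_text("dict")["blocks"]; with PyMuPDF
# installed, real blocks could come from:
#   import fitz
#   blocks = fitz.open("paper.pdf")[0].get_text("dict")["blocks"]
sample_blocks = [
    {"type": 0, "bbox": (72.0, 72.0, 540.0, 120.0)},   # text paragraph
    {"type": 1, "bbox": (72.0, 200.0, 300.0, 400.0)},  # embedded image
]

regions = non_text_regions(sample_blocks)
print(regions)
print(estimated_cost(100))
```

In a full pipeline, each returned rectangle would then be rendered to an image and sent to the vision model; that part is omitted because it depends on the API client in use.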
