CBIhalsen/PolyglotPDF

A multilingual eBook and PDF translator that preserves original layouts while leveraging LLMs for translation.

1.3k stars · Python · Domain: Apps, Data Tooling
PolyglotPDF
Velocity · 7d: +1.6 ★ / day (trend: steady)

PolyglotPDF processes eBooks and PDFs in multiple formats, handling both scanned and digital documents. It preserves the original layout by parsing content into styled HTML that carries color and formatting information. Translation is performed by LLMs through services such as DeepSeek and the OpenAI API, with support for mathematical formulas in LaTeX. The tool uses PyMuPDF for PDF parsing and can process complex documents such as reports with tables.
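
To make the "styled HTML with color and formatting information" step concrete, here is a minimal sketch of how one parsed text span could be turned into an inline-styled HTML element. It is not taken from PolyglotPDF's code: the helper name `span_to_html` is hypothetical, and the input is assumed to be a span dictionary of the shape PyMuPDF returns from `page.get_text("dict")` (keys `text`, `size`, `color`, `flags`).

```python
from html import escape

# Bit masks for the "flags" field of a PyMuPDF text span.
ITALIC = 1 << 1
BOLD = 1 << 4


def span_to_html(span: dict) -> str:
    """Render one text span as an inline-styled HTML <span>.

    Hypothetical helper for illustration; assumes a PyMuPDF-style span dict.
    """
    # PyMuPDF stores the text color as a single sRGB integer (0xRRGGBB).
    color = span.get("color", 0) & 0xFFFFFF
    styles = [
        f"color:#{color:06x}",
        f"font-size:{span.get('size', 12):.1f}pt",
    ]
    flags = span.get("flags", 0)
    if flags & BOLD:
        styles.append("font-weight:bold")
    if flags & ITALIC:
        styles.append("font-style:italic")
    style_attr = ";".join(styles)
    # Escape the text so characters like "<" survive inside the HTML.
    text = escape(span.get("text", ""))
    return f'<span style="{style_attr}">{text}</span>'


example = {"text": "a < b", "size": 11, "color": 0xFF0000, "flags": BOLD}
print(span_to_html(example))
```

With PyMuPDF installed, such spans are found under `page.get_text("dict")["blocks"]`, inside each text block's `"lines"` and each line's `"spans"`. Keeping the style inline on each span is one simple way to let translated text replace the original while retaining its color and weight.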