X-PLUG/mPLUG-DocOwl
A multimodal LLM family developed by Alibaba for understanding documents, charts, and tables without OCR.

Velocity (7d): +2.2 ★/day · Trend: steady
mPLUG-DocOwl is a series of multimodal large language models for document understanding tasks, including chart comprehension, table extraction, and general document analysis. The project includes several models (DocOwl1.5, DocOwl2, TinyChart) along with training and inference code, so users can fine-tune the models on their own data for specialized document understanding applications.