X-PLUG/mPLUG-DocOwl

A multimodal LLM family developed by Alibaba for understanding documents, charts, and tables without OCR.

2.4k stars · Python · Language Models · Computer Vision
Velocity · 7d: +2.2 ★ / day · Trend: steady

mPLUG-DocOwl is a series of multimodal large language models designed for document understanding tasks, including chart comprehension, table extraction, and general document analysis. The project includes several models (DocOwl1.5, DocOwl2, TinyChart) along with training and inference code, so users can fine-tune them on their own data for specialized document understanding applications.
