
opendatalab/DocLayout-YOLO

A real-time document layout detection model based on YOLOv10, pretrained on a synthetic dataset of 300K documents.

2.2k stars · Python · Computer Vision · Data Tooling
Velocity · 7d: +3.6 ★ / day
Trend: steady

DocLayout-YOLO is a document layout detection model that identifies and localizes elements such as text blocks, images, tables, and figures across diverse document types. It introduces Mesh-candidate BestFit, which treats document synthesis as a two-dimensional bin-packing problem in order to generate large-scale labeled data, and a Global-to-Local Controllable Receptive Module (GL-CRM) for detecting elements at multiple scales. The model is pretrained on DocSynth-300K, a diverse synthetic dataset of 300,000 samples, and runs at real-time inference speeds while maintaining accuracy across varying document layouts.
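
To make the bin-packing idea concrete, here is a minimal sketch of a best-fit placement loop. It is not the repository's Mesh-candidate BestFit implementation: the fixed grid of candidate cells, the fill-ratio score, and the names `Box`, `best_fit`, and `synthesize_layout` are all assumptions chosen for illustration. Each candidate cell receives the remaining element that covers the largest share of its area, and the placed boxes then serve as layout labels.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle: top-left corner plus size (hypothetical helper)."""
    x: float
    y: float
    w: float
    h: float


def best_fit(cell: Box, elements: list[tuple[float, float]]) -> Optional[int]:
    """Return the index of the element that fills `cell` best, or None if none fit."""
    best_idx, best_ratio = None, 0.0
    for i, (w, h) in enumerate(elements):
        if w <= cell.w and h <= cell.h:
            ratio = (w * h) / (cell.w * cell.h)  # fraction of the cell covered
            if ratio > best_ratio:
                best_idx, best_ratio = i, ratio
    return best_idx


def synthesize_layout(
    page_w: float,
    page_h: float,
    rows: int,
    cols: int,
    elements: list[tuple[float, float]],
) -> list[Box]:
    """Place (width, height) elements into a rows x cols grid of candidate cells.

    Assumption: candidates are fixed, equally sized grid cells. A real system
    could generate candidates differently.
    """
    pool = list(elements)  # copy so the caller's list is not mutated
    cell_w, cell_h = page_w / cols, page_h / rows
    placed: list[Box] = []
    for r in range(rows):
        for c in range(cols):
            cell = Box(c * cell_w, r * cell_h, cell_w, cell_h)
            idx = best_fit(cell, pool)
            if idx is not None:
                w, h = pool.pop(idx)
                # Anchor the element at the cell's top-left corner.
                placed.append(Box(cell.x, cell.y, w, h))
    return placed


if __name__ == "__main__":
    layout = synthesize_layout(100, 100, 2, 2, [(50, 50), (40, 30), (60, 10), (20, 20)])
    for box in layout:
        print(box)
```

In this example the 60 × 10 element is never placed because it is wider than any 50 × 50 cell, which illustrates why the choice of candidate regions matters for how densely a synthetic page gets filled.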
