kotaro-kinoshita/yomitoku

A deep-learning-based document image analysis engine specialized for Japanese text, providing OCR and layout parsing.

1.4k stars · Python · Computer Vision · Data Tooling
Velocity · 7d: +2.4 ★/day
Trend: steady

YomiToku is a Python package that uses four independently trained deep learning models to perform text detection, string recognition, layout analysis, and table structure extraction on Japanese document images. It recognizes over 7,000 Japanese characters, handles handwritten text and vertical writing layouts, and outputs results in HTML, Markdown, JSON, or CSV format.
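
To make the four output formats concrete, here is a minimal sketch that serializes one small extracted table into JSON, CSV, Markdown, and HTML using only the Python standard library. It does not call YomiToku: the `table` data and the `to_json`, `to_csv`, `to_markdown`, and `to_html` helpers are hypothetical names invented for illustration, and the package's real result objects and export methods may look quite different.

```python
import csv
import io
import json
from html import escape

# Hypothetical stand-in for one extracted table: rows of cell text.
# This is NOT YomiToku's result type; it only illustrates the output formats.
table = [
    ["品目", "数量"],
    ["りんご", "3"],
    ["みかん", "5"],
]


def to_json(rows):
    # ensure_ascii=False keeps Japanese characters readable instead of \u escapes.
    return json.dumps({"rows": rows}, ensure_ascii=False)


def to_csv(rows):
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def to_markdown(rows):
    # Treat the first row as the header of a pipe table.
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def to_html(rows):
    # Escape cell text so characters such as < and & stay literal.
    trs = "".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{trs}</table>"


if __name__ == "__main__":
    for render in (to_json, to_csv, to_markdown, to_html):
        print(render(table))
```

JSON suits downstream programs, CSV suits spreadsheets, and Markdown or HTML preserve a readable table layout, which is presumably why a document analysis tool would offer all four.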
