kha-white/manga-ocr

A specialized OCR model using Vision Encoder Decoder transformers to recognize Japanese text in manga images.

2.7k stars · Python · Computer Vision · ML Frameworks
Velocity · 7d: +1.7 ★ / day
Trend: steady

This repository provides an optical character recognition system specifically optimized for Japanese manga. It uses a custom end-to-end model based on Hugging Face Transformers’ Vision Encoder Decoder architecture. The system handles manga-specific challenges including vertical and horizontal text orientation, furigana annotations, text overlaid on images, diverse font styles, and low-quality images. Unlike typical OCR tools, it processes multi-line text bubbles in a single forward pass without requiring line-by-line splitting.
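As a rough illustration of the single-pass behaviour described above, the sketch below runs OCR over a few cropped speech-bubble images. It assumes the project is installed as the `manga_ocr` package and exposes a `MangaOcr` class whose instances are callable on an image path, as the project's README describes. The helper names `load_ocr` and `read_bubbles` and any file names are hypothetical, introduced only for this example.

```python
from typing import Callable, Dict, Iterable, Optional


def load_ocr() -> Callable[[str], str]:
    """Build the recognizer lazily; the first call may download model weights.

    Assumption: the `manga_ocr` package provides `MangaOcr`, and an instance
    can be called with an image path to get the recognized text back.
    """
    from manga_ocr import MangaOcr  # imported here so the module loads without it

    return MangaOcr()


def read_bubbles(
    image_paths: Iterable[str],
    ocr: Optional[Callable[[str], str]] = None,
) -> Dict[str, str]:
    """Recognize each cropped bubble image and map its path to the text.

    Each image is passed to the recognizer whole: a multi-line bubble is not
    split into individual lines first.
    """
    if ocr is None:
        ocr = load_ocr()
    return {str(path): ocr(str(path)) for path in image_paths}
```

A call such as `read_bubbles(["bubble_01.png", "bubble_02.png"])` would return a dictionary keyed by path. Passing a custom `ocr` callable makes it possible to swap in a stub when testing without loading the model.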