Belval/TextRecognitionDataGenerator
A Python tool that generates synthetic text images with configurable fonts, backgrounds, and distortions to create training data for OCR systems.

Velocity (7d): +1.1 ★/day · Trend: steady
TextRecognitionDataGenerator creates synthetic text images for training optical character recognition (OCR) models. It supports multiple languages, including non-Latin scripts, and offers configurable fonts, backgrounds, and text distortions to produce realistic training data. It can be run from the command line, installed as a pip package, or run in Docker, so users can generate datasets at scale for training custom OCR systems.
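To make the idea of configurable synthetic data concrete, here is a minimal, standard-library-only sketch of how per-image rendering parameters and ground-truth labels might be planned before any image is drawn. It is not the tool's API: `SampleSpec`, `plan_samples`, `label_lines`, the option pools, and the tab-separated label format are all hypothetical names chosen for illustration, and the actual rendering step is left out.

```python
import random
from dataclasses import dataclass

# Hypothetical option pools; none of these names come from the tool itself.
FONTS = ["serif.ttf", "sans.ttf", "mono.ttf"]
BACKGROUNDS = ["plain_white", "gaussian_noise", "picture"]
DISTORTIONS = ["none", "sine_wave", "random"]


@dataclass(frozen=True)
class SampleSpec:
    """Rendering parameters for one synthetic image plus its ground-truth text."""
    text: str
    font: str
    background: str
    distortion: str
    skew_degrees: float
    blur_radius: float


def plan_samples(texts, count, seed=0, max_skew=5.0, max_blur=2.0):
    """Return `count` specs, cycling through `texts` and randomizing the rest."""
    if not texts:
        raise ValueError("texts must not be empty")
    rng = random.Random(seed)  # seeded so the same plan can be regenerated
    specs = []
    for i in range(count):
        specs.append(
            SampleSpec(
                text=texts[i % len(texts)],
                font=rng.choice(FONTS),
                background=rng.choice(BACKGROUNDS),
                distortion=rng.choice(DISTORTIONS),
                skew_degrees=round(rng.uniform(-max_skew, max_skew), 2),
                blur_radius=round(rng.uniform(0.0, max_blur), 2),
            )
        )
    return specs


def label_lines(specs):
    """Format one 'filename<TAB>text' line per sample (an assumed label format)."""
    return [f"{i:06d}.jpg\t{spec.text}" for i, spec in enumerate(specs)]
```

Calling `plan_samples(["hello", "world"], 5, seed=1)` yields five specs whose texts cycle through the input list. Seeding the random generator is a deliberate choice in this sketch: it lets a dataset plan be reproduced exactly, which helps when comparing OCR models trained on the same synthetic data.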