
sparkfish/augraphy

Python library that generates synthetic distorted document images for training machine learning models.

544 stars · Python · Data Tooling · Computer Vision
Velocity · 7d: +0.3 ★ / day · Trend: steady

Augraphy is an augmentation pipeline library that creates realistic synthetic documents by simulating printing, scanning, faxing, and copying. It applies configurable transformations to clean documents to produce degraded versions, which yields large volumes of paired training data for document-processing neural networks. Instead of collecting degraded documents and trying to recover their originals, it starts from known-good originals and degrades them, so every degraded image comes with its ground-truth original for training models that remove document distortions.
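The degrade-from-clean idea can be illustrated with a minimal sketch in plain Python. This is not Augraphy's API: the function names (`make_clean_page`, `fade_ink`, `add_speckle`, `degrade`, `make_pairs`), the toy page layout, and the two transforms are all hypothetical stand-ins, chosen only to show how applying a chain of configurable transformations to a known-good original produces (degraded, clean) training pairs.

```python
import random


def make_clean_page(width=32, height=16):
    """Hypothetical clean 'document': a white page (255) with dark text lines (0)."""
    page = [[255] * width for _ in range(height)]
    for y in range(2, height, 4):
        for x in range(2, width - 2):
            page[y][x] = 0
    return page


def fade_ink(page, rng, strength=60):
    """Lighten dark pixels by a random amount, loosely imitating weak toner."""
    return [
        [min(255, px + rng.randint(0, strength)) if px < 128 else px for px in row]
        for row in page
    ]


def add_speckle(page, rng, prob=0.05):
    """Replace random pixels with black or white, loosely imitating scan/fax noise."""
    return [
        [rng.choice((0, 255)) if rng.random() < prob else px for px in row]
        for row in page
    ]


def degrade(page, transforms, rng):
    """Apply each transform in order; every transform returns a new page."""
    out = page
    for transform in transforms:
        out = transform(out, rng)
    return out


def make_pairs(clean, n, seed=0):
    """Build n (degraded, clean) pairs from one known-good original."""
    rng = random.Random(seed)
    transforms = [fade_ink, add_speckle]  # assumed order: ink effects, then noise
    return [(degrade(clean, transforms, rng), clean) for _ in range(n)]


clean_page = make_clean_page()
pairs = make_pairs(clean_page, n=3)
```

Each pair holds a degraded page as the model input and the untouched original as the target. Because the original is never modified, one clean document can yield any number of distinct degraded variants.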
