cleanlab/cleanvision
A Python library that automatically audits image datasets to find quality issues like duplicates, blur, and exposure problems.

Velocity · 7d
+0.8
★ / day
Trend
→steady
star history
CleanVision is a data-centric AI package for automatically detecting potential issues in image datasets before applying machine learning. It checks for predefined issues including blurry images, under/over-exposed images, near-duplicates, and other visual quality problems. Users simply point it at a folder of images and run a few lines of Python to generate a report of dataset issues that should be addressed prior to model training.