cleanlab/cleanlab
A Python library that detects and fixes data quality issues in ML datasets using your existing models.

Velocity (7d): +3.9 ★/day · Trend: steady
Cleanlab analyzes datasets to automatically identify labeling errors, outliers, duplicate entries, and other data quality problems that degrade model performance. It works with any existing ML model to score data points based on model behavior, then surfaces issues that can be addressed through data cleaning, reannotation, or curation to train better models.
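As a rough illustration of scoring data points from model behavior, the sketch below ranks examples by how much probability a model assigns to each example's given label and flags the low-scoring ones for review. It is a simplified stand-in, not cleanlab's API or its actual algorithm, which is more involved. The function names, the fixed `threshold` of 0.5, and the toy data are all assumptions made for this example; in practice the predicted probabilities would come from your own model, ideally out-of-sample.

```python
def self_confidence_scores(labels, pred_probs):
    """Return the probability the model assigns to each example's given label."""
    return [probs[label] for label, probs in zip(labels, pred_probs)]


def flag_possible_label_issues(labels, pred_probs, threshold=0.5):
    """Return indices of examples scoring below `threshold`, lowest score first.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    scores = self_confidence_scores(labels, pred_probs)
    flagged = [i for i, score in enumerate(scores) if score < threshold]
    return sorted(flagged, key=lambda i: scores[i])


# Toy data (made up for illustration): 5 examples, 2 classes.
labels = [0, 1, 0, 1, 0]
pred_probs = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.1, 0.9],  # labeled 0, but the model strongly prefers class 1
    [0.6, 0.4],  # labeled 1, but the model mildly prefers class 0
    [0.7, 0.3],
]

issues = flag_possible_label_issues(labels, pred_probs)
print(issues)
```

The flagged indices are candidates for the cleaning, reannotation, or curation step described above; a low score means the model disagrees with the label, not that the label is necessarily wrong.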