
cleanlab/cleanlab

A Python library that detects and fixes data quality issues in ML datasets using your existing models.

11.5k stars · Python · Data Tooling · ML Frameworks
Velocity · 7d: +3.9 ★ / day
Trend: steady

Cleanlab analyzes datasets to automatically identify labeling errors, outliers, duplicate entries, and other data quality problems that degrade model performance. It works with any existing ML model to score data points based on model behavior, then surfaces issues that can be addressed through data cleaning, reannotation, or curation to train better models.
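
To make "scoring data points based on model behavior" concrete, here is a minimal sketch in plain Python. It is not cleanlab's API or its actual algorithm. It is a simplified illustration that assumes you already have a model's predicted class probabilities for each example (ideally out-of-sample) and scores each example by the probability the model assigns to its given label. The function names `self_confidence_scores` and `flag_suspect_labels`, the `threshold` parameter, and the toy data are all hypothetical.

```python
from typing import List, Sequence


def self_confidence_scores(
    labels: Sequence[int], pred_probs: Sequence[Sequence[float]]
) -> List[float]:
    """Return, for each example, the model's predicted probability of its given label.

    A low score means the model disagrees with the label, which may indicate
    a labeling error (or simply a hard example).
    """
    return [probs[label] for label, probs in zip(labels, pred_probs)]


def flag_suspect_labels(
    labels: Sequence[int],
    pred_probs: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical cutoff, chosen for illustration only
) -> List[int]:
    """Return indices of examples scoring below `threshold`, most suspicious first."""
    scores = self_confidence_scores(labels, pred_probs)
    suspects = [i for i, score in enumerate(scores) if score < threshold]
    return sorted(suspects, key=lambda i: scores[i])


# Toy data: four examples, two classes. Example 2 is labeled 1,
# but the model puts almost all probability on class 0.
labels = [0, 1, 1, 0]
pred_probs = [
    [0.90, 0.10],
    [0.20, 0.80],
    [0.95, 0.05],
    [0.60, 0.40],
]

suspects = flag_suspect_labels(labels, pred_probs)
print(suspects)
```

Flagged indices are candidates for review, reannotation, or removal rather than confirmed errors; a fixed global threshold is the crudest possible rule and is used here only to keep the sketch short.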
