
daochenzha/data-centric-AI

A curated list of research papers, tutorials, and resources on data-centric AI, a discipline that improves ML models by improving training data quality.

1.1k stars · Tags: Learning, Data Tooling, data-centric-AI
Velocity (7d): +1.0 ★/day · Trend: steady

This repository aggregates academic survey papers, perspective papers, tutorials (including a KDD 2023 tutorial), and blog posts on data-centric AI techniques. It serves as a reference collection for researchers and practitioners interested in data curation, data quality, and data engineering approaches to machine learning. The list covers topics like data-centric concepts behind foundation models (GPT, Segment Anything) and future directions in data-centric ML.
