
CLUEbenchmark/CLUEDatasetSearch

A searchable catalog of Chinese and English NLP datasets, categorized by task type, such as NER, QA, sentiment analysis, and text classification.

4.4k stars · Python · Data Tooling · Learning
Velocity · 7d: +1.9 ★ / day · Trend: steady

This repository aggregates and indexes NLP datasets in Chinese and English, organized by task categories including named entity recognition, question answering, sentiment analysis, text classification, text matching, text summarization, machine translation, and knowledge graphs. It provides a searchable interface and accepts community contributions for dataset additions. The project also references a companion clueai toolkit for zero-shot NLP development.
