
CLUEbenchmark/CLUENER2020

CLUENER2020 is a Chinese fine-grained named entity recognition benchmark dataset with 10 entity categories.

1.5k stars · Python · Data Tooling · Language Models
Velocity (7d): +0.6 ★ / day · Trend: steady

The repository provides a labeled dataset for fine-grained named entity recognition in Chinese text, covering 10 entity types: address, book, company, game, government, movie, name, organization, position, and scene. It also includes baseline implementations that use pre-trained language models such as BERT, RoBERTa, and ALBERT for sequence labeling. The dataset serves as a benchmark for training and evaluating NER models.
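
To make the sequence labeling framing concrete, here is a minimal sketch of turning span annotations into per-character BIO tags, the usual input for BERT-style taggers. It assumes each record is a JSON object with a `text` string and a `label` mapping of entity type to mention to a list of `[start, end]` character spans with inclusive ends; that layout, the sample record, and the helper name `spans_to_bio` are illustrative assumptions, not taken from this page.

```python
import json

# The 10 entity types listed in the description above.
ENTITY_TYPES = [
    "address", "book", "company", "game", "government",
    "movie", "name", "organization", "position", "scene",
]


def spans_to_bio(text, label):
    """Convert span annotations into one BIO tag per character.

    Assumption: `label` maps entity type -> mention -> list of
    [start, end] spans, with `end` inclusive.
    """
    tags = ["O"] * len(text)
    for entity_type, mentions in label.items():
        if entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {entity_type}")
        for spans in mentions.values():
            for start, end in spans:
                tags[start] = f"B-{entity_type}"
                for i in range(start + 1, end + 1):
                    tags[i] = f"I-{entity_type}"
    return tags


# Hypothetical record, written for this example only.
line = (
    '{"text": "浙商银行的叶老桂", '
    '"label": {"company": {"浙商银行": [[0, 3]]}, "name": {"叶老桂": [[5, 7]]}}}'
)
record = json.loads(line)
tags = spans_to_bio(record["text"], record["label"])

for char, tag in zip(record["text"], tags):
    print(char, tag)
```

Tagging per character rather than per word avoids a separate Chinese word segmentation step, which is a common choice for Chinese NER; if the actual files use exclusive end indices or a different layout, the inner loop would need adjusting.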
