
ymcui/MacBERT

A Chinese pre-trained language model (MacBERT) that improves BERT-style pre-training with an MLM-as-correction (Mac) objective, reducing the discrepancy between pre-training and fine-tuning.

715 stars · Language Models · MacBERT
Velocity (7d): +0.3 ★/day · Trend: steady

MacBERT is a pre-trained Chinese language model developed by HFL, the Joint Laboratory of HIT and iFLYTEK Research. It introduces the MLM as correction (Mac) pre-training task: words selected by whole-word and n-gram masking are replaced with similar words rather than with the artificial [MASK] token, which never appears during fine-tuning. This reduces the gap between pre-training and fine-tuning. The model is compatible with Hugging Face Transformers and supports Chinese NLP downstream tasks including text classification, named entity recognition, and question answering.
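To make the masking idea concrete, here is a minimal sketch of similar-word corruption on an already segmented sentence. It is not the project's actual pre-training code. The `SIMILAR_WORDS` table, the `mac_corrupt` function, the toy vocabulary, and the 80/10/10 split between similar word, random word, and unchanged word are all assumptions made for illustration, and the sketch selects single words rather than n-gram spans.

```python
import random

# Hypothetical similar-word table for illustration only; a real pipeline would
# query a synonym toolkit or embedding neighbours instead.
SIMILAR_WORDS = {
    "今天": ["今日"],
    "天气": ["气候"],
    "喜欢": ["喜爱"],
}


def mac_corrupt(words, similar, vocab, mask_rate=0.15, rng=None):
    """Corrupt a segmented sentence without ever inserting a [MASK] token.

    Returns (corrupted, labels). labels[i] holds the original word at each
    selected position and None everywhere else.
    """
    if not words:
        return [], []
    rng = rng or random.Random()
    n_select = max(1, round(len(words) * mask_rate))
    positions = rng.sample(range(len(words)), n_select)

    corrupted = list(words)
    labels = [None] * len(words)
    for i in positions:
        labels[i] = words[i]  # the model is trained to recover this word
        roll = rng.random()
        candidates = similar.get(words[i])
        if roll < 0.8:
            # Assumed split: mostly substitute a similar word, falling back
            # to a random vocabulary word when no similar word is known.
            corrupted[i] = rng.choice(candidates) if candidates else rng.choice(vocab)
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)
        # Otherwise the original word is left in place.
    return corrupted, labels


words = ["今天", "天气", "很", "好", "我", "喜欢", "散步"]
vocab = ["苹果", "学习", "城市"]  # toy vocabulary for random replacement
corrupted, labels = mac_corrupt(words, SIMILAR_WORDS, vocab, mask_rate=0.5,
                                rng=random.Random(0))
print(corrupted)
print(labels)
```

Because every corrupted position still holds a real word, the inputs seen during pre-training look like the ordinary text seen during fine-tuning, which is the discrepancy the Mac task is meant to reduce.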
