
ymcui/Chinese-BERT-wwm

Chinese pre-trained language models using Whole Word Masking technique, including BERT-wwm and RoBERTa-wwm variants.

10.2k stars · Python · Language Models
Velocity (7d): +4.0 ★ / day · Trend: steady

This repository hosts Chinese pre-trained language models trained with Whole Word Masking (WWM). It provides multiple model variants, including BERT-wwm, BERT-wwm-ext, RoBERTa-wwm-ext, RoBERTa-wwm-ext-large, RBT3, and RBTL3. The models are built on Google’s official BERT implementation and can be used with both PyTorch and TensorFlow. During pre-training, WWM masks all the tokens of a word together rather than masking individual subword tokens independently, which is intended to improve performance on Chinese NLP tasks.
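
To make the masking idea concrete, here is a minimal sketch of whole-word selection in Python. It is not the repository's pre-training code. It assumes the sentence is already segmented into words and that each Chinese character is one token. The function name `whole_word_mask`, the 15% budget, and the sample sentence are illustrative choices. The sketch also skips the usual BERT step of occasionally keeping the original token or substituting a random one.

```python
import random

MASK = "[MASK]"


def whole_word_mask(words, mask_ratio=0.15, seed=0):
    """Return (tokens, masked_tokens) for a pre-segmented sentence.

    Hypothetical helper for illustration: when a word is selected,
    every character (token) belonging to it is replaced by [MASK].
    """
    rng = random.Random(seed)

    # Assumption: one token per character, so a word is a span of tokens.
    tokens = [ch for word in words for ch in word]

    # Rough masking budget in tokens; always mask at least one word.
    budget = max(1, round(len(tokens) * mask_ratio))

    # Visit words in random order and select them until the budget is met.
    # The last selected word may push the count slightly over the budget.
    order = list(range(len(words)))
    rng.shuffle(order)
    chosen = set()
    masked_count = 0
    for idx in order:
        if masked_count >= budget:
            break
        chosen.add(idx)
        masked_count += len(words[idx])

    # Emit [MASK] for every character of a chosen word, else the character.
    masked = []
    for idx, word in enumerate(words):
        for ch in word:
            masked.append(MASK if idx in chosen else ch)
    return tokens, masked


if __name__ == "__main__":
    # Illustrative segmentation; a real pipeline would use a word segmenter.
    sample = ["使用", "语言", "模型", "来", "预测", "下一个", "词"]
    original, masked = whole_word_mask(sample)
    print(" ".join(original))
    print(" ".join(masked))
```

The point of the sketch is the span-level decision: a multi-character word such as 模型 is either fully masked or left intact, whereas per-token masking could hide 模 while leaving 型 visible.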
