← all repositories

microsoft/DeBERTa

Microsoft's implementation of DeBERTa, a decoding-enhanced BERT model with disentangled attention and ELECTRA-style pre-training.

2.2k stars Python Language ModelsML Frameworks
DeBERTa
Velocity · 7d
+1.0
★ / day
Trend
steady
star history

This repository provides the official implementation of DeBERTa and DeBERTa V3, transformer-based language models that improve on BERT through disentangled attention mechanisms. The implementation includes code for model pre-training, continuous training, and fine-tuning on downstream NLP tasks including SuperGLUE benchmarks. Models ranging from 22M to 1.5B parameters are available via the Hugging Face model hub.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.