
aub-mind/arabert

Pre-trained transformer models (BERT, GPT2, ELECTRA variants) specifically trained on Arabic text data.

721 stars · Python · Language Models
arabert
Velocity (7d): +0.3 ★/day · Trend: steady

This repository provides several pre-trained Arabic language models: AraBERT (v0.1/v1 and v0.2/v2), AraGPT2 (base, medium, large, and mega variants), and AraELECTRA. The models are trained from scratch on large Arabic corpora and target Arabic natural language understanding and generation tasks. The project also ships a text preprocessor (ArabertPreprocessor) and can be installed via pip; the models themselves are available through Hugging Face Transformers.
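As a rough illustration of that workflow, the sketch below preprocesses a sentence and runs it through one of the models. It assumes the package is installed with pip install arabert, that the preprocessor is importable as ArabertPreprocessor from arabert.preprocess, and that the Hugging Face Hub IDs in CHECKPOINTS are current; the CHECKPOINTS table and the resolve_checkpoint and encode helpers are hypothetical names introduced here, not part of the repository. Check the repository README before relying on any of them.

```python
"""Sketch: preprocess Arabic text and encode it with an AraBERT-family model."""

# Assumed Hugging Face Hub IDs (one per model family); verify against the README.
CHECKPOINTS = {
    "arabert": "aubmindlab/bert-base-arabertv2",
    "aragpt2": "aubmindlab/aragpt2-base",
    "araelectra": "aubmindlab/araelectra-base-discriminator",
}


def resolve_checkpoint(family: str) -> str:
    """Map a model family name (case-insensitive) to an assumed Hub ID."""
    key = family.strip().lower()
    if key not in CHECKPOINTS:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"unknown model family {family!r}; expected one of: {known}")
    return CHECKPOINTS[key]


def encode(text: str, family: str = "arabert"):
    """Return the final hidden states for `text` (downloads weights on first use)."""
    # Imported lazily so the lookup helper above works without the heavy
    # dependencies (arabert, transformers, and PyTorch) installed.
    from arabert.preprocess import ArabertPreprocessor
    from transformers import AutoModel, AutoTokenizer

    model_name = resolve_checkpoint(family)

    # The preprocessor applies the text normalization matching the checkpoint.
    preprocessor = ArabertPreprocessor(model_name=model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)

    inputs = tokenizer(preprocessor.preprocess(text), return_tensors="pt")
    return model(**inputs).last_hidden_state
```

Calling encode("مرحبا بالعالم") would download the checkpoint on first use and return a tensor of per-token hidden states. The preprocessor is given the same model name as the tokenizer because the expected normalization can differ between model versions.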
