
unitaryai/detoxify

BERT-based toxicity classifier with models covering the three Jigsaw toxic comment challenges.


Detoxify provides trained models and code for multi-label toxicity classification of text comments. It uses transformer-based architectures (BERT, ALBERT), is trained with PyTorch Lightning, and ships three model variants: original, unbiased, and multilingual. Each model scores a comment against toxicity categories such as toxic, severe_toxic, obscene, threat, insult, and identity_attack, with AUC scores of 93% or higher on benchmark tests.
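
To make "multi-label" concrete, here is a minimal sketch in plain Python. It is not Detoxify's code. It assumes a model emits one raw score (logit) per category and that each category is judged independently with a sigmoid and a threshold. The logit values, the 0.5 threshold, and the helper names `score_labels` and `flag_labels` are illustrative assumptions; only the label names come from the description above.

```python
import math

# Label names as listed in the description; everything else here is illustrative.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_attack"]


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def score_labels(logits: list[float]) -> dict[str, float]:
    """Turn one logit per label into an independent per-label probability.

    Unlike softmax, the probabilities are not forced to sum to 1,
    so a comment can score high on several labels at once.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    return {label: sigmoid(z) for label, z in zip(LABELS, logits)}


def flag_labels(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return every label whose probability meets the (assumed) threshold."""
    return [label for label, p in scores.items() if p >= threshold]


# Made-up logits standing in for a model's output on a single comment.
example_logits = [2.0, -3.0, 1.0, -4.0, 0.5, -2.0]
scores = score_labels(example_logits)
flagged = flag_labels(scores)

for label, p in scores.items():
    print(f"{label:16s} {p:.3f}")
print("flagged:", flagged)
```

Because each label is thresholded on its own, a single comment can be flagged as toxic, obscene, and insulting at the same time, which is what separates multi-label classification from picking one class.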
