← all repositories

jerryji1993/DNABERT

A pre-trained BERT-style transformer model that treats DNA sequences as a language for genome analysis.

756 stars Python Language ModelsDomain Apps
DNABERT
Velocity · 7d
+0.3
★ / day
Trend
steady
star history

DNABERT applies bidirectional transformer architecture to genomic data, treating DNA k-mers as tokens analogous to words. The model provides pre-trained DNA embeddings that can be fine-tuned for downstream genomic tasks such as prediction, classification, and sequence understanding. It extends HuggingFace Transformers framework for DNA-specific preprocessing and training pipelines.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.