← all repositories

google-research/bigbird

Sparse-attention transformer model extending BERT to process much longer sequences for NLP tasks.

634 stars Python Language ModelsML Frameworks
bigbird
Velocity · 7d
+0.3
★ / day
Trend
steady
star history

BigBird is a transformer architecture with sparse attention mechanisms designed to handle sequences significantly longer than standard BERT. It provides theoretical guarantees about what tasks the sparse approximation can capture while drastically improving performance on NLP tasks like question answering and summarization. The codebase includes core attention implementations, encoder stacks, and packaged BERT and seq2seq models with BigBird attention.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.