abertsch72/unlimiformer

An implementation of retrieval-based attention augmentation for encoder-decoder transformers to handle unlimited-length inputs, supporting Llama-2 and its derivatives.

1.1k stars · Python · Language Models · ML Frameworks
unlimiformer
Velocity · 7d: +0.9 ★ / day
Trend: steady

Unlimiformer augments pretrained encoder-decoder models with retrieval-based attention, allowing unlimited-length inputs without changing the mathematical definition of attention. It can improve already-trained models, or be used during training for the best results. The implementation supports arbitrary pretrained models, including Llama-2, which enables use cases such as summarizing entire books.
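To illustrate the general idea of retrieval-based attention described above, here is a minimal NumPy sketch. It is not the repository's code: the function names `full_attention` and `topk_retrieval_attention` are hypothetical, and a brute-force top-k search stands in for whatever nearest-neighbor index a real implementation would use. The sketch assumes a single query vector attending over a long sequence of keys and values, and keeps the standard softmax attention formula while restricting it to the k highest-scoring keys.

```python
import numpy as np


def full_attention(q, K, V):
    """Standard scaled dot-product attention for one query vector."""
    scores = K @ q / np.sqrt(q.shape[0])
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ V


def topk_retrieval_attention(q, K, V, k):
    """Same formula, applied only to the k keys with the highest scores.

    Hypothetical illustration: brute-force search replaces a real
    nearest-neighbor index.
    """
    scores = K @ q / np.sqrt(q.shape[0])
    idx = np.argpartition(scores, -k)[-k:]  # indices of the k largest scores
    top = scores[idx]
    weights = np.exp(top - top.max())
    weights /= weights.sum()
    return weights @ V[idx]


rng = np.random.default_rng(0)
n, d, d_v = 10_000, 16, 8          # a "long input" of 10,000 positions
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d_v))
q = rng.normal(size=d)

approx = topk_retrieval_attention(q, K, V, k=64)
exact = full_attention(q, K, V)
```

When `k` equals the number of keys, the two functions compute the same result; for smaller `k`, the output is an approximation whose quality depends on how concentrated the attention weights are.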
