abertsch72/unlimiformer
An implementation of retrieval-based attention augmentation that lets pretrained encoder-decoder transformers handle unlimited-length inputs; it also supports Llama-2 and its derivatives.

Velocity (7d): +0.9 ★/day · Trend: steady
Unlimiformer augments pretrained encoder-decoder models with retrieval-based attention, which lets them accept inputs of unlimited length without changing the mathematical definition of attention. It can be applied to an already-trained model as is, or used during training for the best results. The implementation works with a range of pretrained models, including Llama-2, enabling use cases such as summarizing entire books.
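To make the idea of retrieval-based attention concrete, here is a minimal sketch in plain Python. It is not the repository's code: the function names (`full_attention`, `topk_retrieval_attention`) and the brute-force search are assumptions made for illustration, and a practical system would more likely query a nearest-neighbor index than score every key. The sketch shows that attending only to the k highest-scoring keys uses the same softmax-weighted sum as ordinary attention, and that it matches full attention exactly when k covers every key.

```python
import heapq
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def full_attention(query, keys, values):
    """Standard scaled dot-product attention for a single query."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def topk_retrieval_attention(query, keys, values, k):
    """Attention restricted to the k keys with the highest scores.

    Hypothetical helper: the top-k search here is brute force, standing in
    for whatever retrieval index a real implementation would use.
    """
    scale = math.sqrt(len(query))
    scores = [dot(query, key) / scale for key in keys]
    # Indices of the k best-matching keys for this query.
    top = heapq.nlargest(k, range(len(keys)), key=scores.__getitem__)
    # Same softmax-weighted sum as full attention, over the retrieved subset.
    weights = softmax([scores[i] for i in top])
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, top)) for d in range(dim)]


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [-1.0, -1.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
    query = [1.0, 1.0]
    print(full_attention(query, keys, values))
    print(topk_retrieval_attention(query, keys, values, k=2))
```

Because softmax weights concentrate on the highest-scoring keys, a small k can stay close to the full result while the per-step cost depends on k rather than on the total input length.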