
tomaarsen/attention_sinks

A library that modifies pre-trained LLMs to generate fluent text indefinitely beyond their original training context using attention sink mechanisms.

Star history (attention_sinks): 7-day velocity +0.8 ★/day, trend steady

The library adapts existing transformer-based LLMs to use a variant of sliding-window attention that keeps generated text coherent over arbitrarily long sequences. No retraining is required: the modifications are applied post hoc to the attention mechanism. The project also provides benchmark code that compares perplexity across several model families, including Llama-2, Falcon, Mistral, and GPT-J, in long-context generation scenarios.
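To make the idea concrete, here is a minimal sketch of the cache policy usually associated with attention sinks: the first few tokens (the "sinks") are always kept, along with a sliding window of the most recent tokens, and everything in between is evicted. This is an illustration only, not the library's code; the function name `evict_with_sinks` and the parameters `sink_size` and `window_size` are hypothetical, and the cache is modeled as a plain list of token positions rather than real key/value tensors.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def evict_with_sinks(
    cache: Sequence[T], sink_size: int = 4, window_size: int = 8
) -> List[T]:
    """Keep the first `sink_size` entries plus the last `window_size` entries.

    Hypothetical helper for illustration; a real implementation would apply
    the same selection to per-layer key/value tensors.
    """
    if sink_size < 0 or window_size < 0:
        raise ValueError("sink_size and window_size must be non-negative")
    # Nothing to evict while the cache still fits in the budget.
    if len(cache) <= sink_size + window_size:
        return list(cache)
    sinks = list(cache[:sink_size])
    # Slice from an explicit start index so window_size == 0 yields [].
    recent = list(cache[len(cache) - window_size:])
    return sinks + recent


# Simulate streaming generation: the cache never exceeds sink + window entries.
cache: List[int] = []
for position in range(20):
    cache.append(position)
    cache = evict_with_sinks(cache, sink_size=4, window_size=8)

print(cache)
```

After 20 simulated steps the cache holds positions 0 to 3 (the sinks) and 12 to 19 (the recent window), so memory stays bounded regardless of how long generation runs.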
