datamllab/LongLM

Self-Extend extends LLM context windows without fine-tuning by applying two levels of attention chosen by relative distance: standard attention for nearby tokens and grouped attention for distant ones.

666 stars · Python · Language Models · LongLM
Velocity (7d): +0.8 ★/day · Trend: steady

This repository implements Self-Extend, a technique that lets large language models handle longer context windows without fine-tuning. It splits attention into two levels based on relative token distance: nearby tokens use standard attention with exact positions, while distant tokens use grouped attention, in which several positions share one position index. This keeps relative positions within the range the model saw during pretraining, so it can attend over inputs longer than its pretrained context window. The implementation includes optimized versions using FlashAttention and Triton, and supports several LLM families, including Llama, Qwen, and Gemma.
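The distance-based split can be illustrated with a small position-mapping sketch. This is a minimal illustration based on the Self-Extend idea as described above, not the repository's code: the function names `self_extend_relative_position` and `max_extended_length`, and the parameters `group_size` and `neighbor_window`, are hypothetical and chosen here for clarity.

```python
def self_extend_relative_position(q_pos, k_pos, group_size, neighbor_window):
    """Relative position a query at q_pos would use for a key at k_pos.

    Hypothetical helper (not the repository's API). Assumes causal
    attention, so k_pos must not exceed q_pos.
    """
    distance = q_pos - k_pos
    if distance < 0:
        raise ValueError("causal attention: key must not follow query")
    if distance < neighbor_window:
        # Local level: nearby tokens keep their exact relative distance.
        return distance
    # Grouped level: positions are floor-divided by group_size, then shifted
    # so the grouped range continues where the local window ends.
    shift = neighbor_window - neighbor_window // group_size
    return q_pos // group_size - k_pos // group_size + shift


def max_extended_length(pretrained_length, group_size, neighbor_window):
    """Longest input whose mapped relative positions stay below pretrained_length."""
    return (pretrained_length - neighbor_window) * group_size + neighbor_window


if __name__ == "__main__":
    # Nearby token: exact distance is preserved.
    print(self_extend_relative_position(100, 97, group_size=4, neighbor_window=8))
    # Distant token: a distance of 100 is compressed by grouping.
    print(self_extend_relative_position(100, 0, group_size=4, neighbor_window=8))
    # Reachable length for a 4096-token model under these assumed settings.
    print(max_extended_length(4096, group_size=8, neighbor_window=1024))
```

Under these assumptions, a larger `group_size` reaches longer inputs at the cost of coarser position information for distant tokens, while a larger `neighbor_window` keeps more of the context at exact resolution.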