lucidrains/local-attention

A PyTorch implementation of local windowed attention mechanisms for efficient transformer-based language modeling.

499 stars · Python · ML Frameworks · Language Models
Velocity (7d): +0.2 ★/day · Trend: steady

This repository provides a PyTorch implementation of local windowed attention, a transformer component that restricts attention to fixed-size windows of nearby tokens so that language models stay efficient on long sequences. It supports causal masking, relative positional encoding, and a shared query/key space for Reformer-style architectures. The code is intended as a reusable building block for training transformer-based language models, with the stated aim of providing a strong baseline when local attention is used in the bottom transformer layers.
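
To make the windowing idea concrete, here is a minimal pure-Python sketch of causal local attention. It is not the repository's API or its implementation: the function name `local_causal_attention`, the list-of-lists input layout, and the per-position sliding window are all assumptions chosen for illustration, and a real PyTorch version would operate on batched tensors.

```python
import math

def local_causal_attention(q, k, v, window_size):
    """Illustrative sketch (hypothetical helper, not the library's API).

    q, k, v: lists of vectors (lists of floats), one per sequence position.
    Each position i attends only to positions max(0, i - window_size + 1)..i,
    i.e. itself and up to window_size - 1 earlier positions (causal).
    """
    n = len(q)
    scale = 1.0 / math.sqrt(len(q[0]))  # standard 1/sqrt(d) scaling
    dim_v = len(v[0])
    out = []
    for i in range(n):
        start = max(0, i - window_size + 1)
        window = range(start, i + 1)
        # Scaled dot-product scores against keys inside the window only.
        scores = [scale * sum(a * b for a, b in zip(q[i], k[j])) for j in window]
        # Numerically stable softmax over the window.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Weighted average of the values inside the window.
        out.append([
            sum(w * v[j][d] for w, j in zip(weights, window)) / total
            for d in range(dim_v)
        ])
    return out
```

Each position does work proportional to `window_size` rather than to the full sequence length, which is where the efficiency of local attention comes from. With `window_size=1` every position sees only itself, so the output equals `v`.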