
lucidrains/recurrent-memory-transformer-pytorch

A PyTorch implementation of the Recurrent Memory Transformer architecture for processing long sequences using memory tokens.

423 stars · Python · ML Frameworks · Language Models
Velocity (7d): +0.4 ★/day · Trend: steady

This repository implements the Recurrent Memory Transformer (RMT) paper in PyTorch. RMT splits a long input into segments and adds memory tokens to each one; the memory embeddings produced for one segment are passed to the next, so information is compressed and carried across segment boundaries. This lets a transformer that attends over one segment at a time process sequences that are, in principle, of arbitrary length. The implementation supports flash attention and is designed for autoregressive sequence modeling tasks.
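
The segment-to-segment memory passing described above can be sketched roughly as follows. This is a minimal illustration and not the repository's API: `ToyRecurrentMemoryModel`, `process_long_sequence`, and all parameter names are hypothetical. It also makes simplifying assumptions: it uses a single stock `nn.TransformerEncoderLayer` with no causal mask and operates on pre-embedded inputs, whereas an autoregressive model would add token embeddings, causal masking, and an output head.

```python
import torch
from torch import nn


class ToyRecurrentMemoryModel(nn.Module):
    """Hypothetical sketch of memory-token recurrence (not the repo's class)."""

    def __init__(self, dim=32, heads=4, num_memory_tokens=4):
        super().__init__()
        self.num_memory_tokens = num_memory_tokens
        # Learned initial memory, used only for the first segment.
        self.init_memory = nn.Parameter(torch.randn(num_memory_tokens, dim) * 0.02)
        # Assumption: one unmasked encoder layer stands in for the full model.
        self.block = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim,
            dropout=0.0, batch_first=True,
        )

    def forward(self, segment, memory=None):
        # segment: (batch, seg_len, dim)
        # memory:  (batch, num_memory_tokens, dim), or None for the first segment
        batch = segment.shape[0]
        if memory is None:
            memory = self.init_memory.unsqueeze(0).expand(batch, -1, -1)
        # Prepend memory so segment tokens and memory tokens attend to each other.
        x = torch.cat([memory, segment], dim=1)
        x = self.block(x)
        m = self.num_memory_tokens
        # The updated memory positions are handed to the next segment.
        new_memory, out = x[:, :m], x[:, m:]
        return out, new_memory


def process_long_sequence(model, sequence, segment_len):
    """Run the model segment by segment, carrying memory forward."""
    memory = None
    outputs = []
    for segment in sequence.split(segment_len, dim=1):
        out, memory = model(segment, memory)
        outputs.append(out)
    return torch.cat(outputs, dim=1), memory


if __name__ == "__main__":
    model = ToyRecurrentMemoryModel()
    long_input = torch.randn(2, 50, 32)  # 50 steps, processed 16 at a time
    out, final_memory = process_long_sequence(model, long_input, segment_len=16)
    print(out.shape)           # torch.Size([2, 50, 32])
    print(final_memory.shape)  # torch.Size([2, 4, 32])
```

In this sketch attention cost depends on `segment_len + num_memory_tokens` rather than on the full sequence length, which is the property that makes long inputs tractable; the trade-off is that anything from earlier segments has to fit in the few memory embeddings.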
