
Jamie-Stirling/RetNet

A minimal pure PyTorch implementation of RetNet, an alternative neural architecture to Transformers for large language models.

1.2k stars · Python · Language Models · ML Frameworks
Velocity (7d): +1.2 ★/day · Trend: steady

This repository provides a PyTorch implementation of the Retentive Network architecture described in the paper ‘Retentive Network: A Successor to Transformer for Large Language Models’. It implements single-scale and multi-scale retention mechanisms across parallel, recurrent, and chunkwise computation paradigms. The codebase includes multi-layer networks with feed-forward layers and layer normalization, plus a causal language model built on the retentive architecture. It prioritizes code correctness and readability over optimization.
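To make the parallel and recurrent paradigms concrete, here is a minimal sketch of single-head retention with a scalar decay `gamma`. It is an illustration under simplifying assumptions, not the repository's code: the function names `retention_parallel` and `retention_recurrent` are hypothetical, and the position-dependent rotation of queries and keys and the normalization described in the paper are omitted.

```python
import torch


def retention_parallel(q, k, v, gamma):
    """Parallel form: (Q K^T * D) V, where D[n, m] = gamma**(n - m) if n >= m, else 0."""
    seq_len = q.shape[0]
    idx = torch.arange(seq_len)
    diff = idx[:, None] - idx[None, :]  # n - m for every (query, key) pair
    # Clamp before exponentiating so masked entries never overflow, then keep
    # only the causal (lower-triangular) part.
    decay = torch.tril(gamma ** diff.clamp(min=0).to(q.dtype))
    return (q @ k.T * decay) @ v


def retention_recurrent(q, k, v, gamma):
    """Recurrent form: S_n = gamma * S_{n-1} + k_n^T v_n, and o_n = q_n S_n."""
    state = torch.zeros(k.shape[1], v.shape[1], dtype=q.dtype)
    outputs = []
    for q_n, k_n, v_n in zip(q, k, v):  # one token at a time
        state = gamma * state + torch.outer(k_n, v_n)
        outputs.append(q_n @ state)
    return torch.stack(outputs)


# Usage: both forms should agree on the same inputs.
torch.manual_seed(0)
q = torch.randn(6, 4, dtype=torch.float64)
k = torch.randn(6, 4, dtype=torch.float64)
v = torch.randn(6, 3, dtype=torch.float64)
assert torch.allclose(retention_parallel(q, k, v, 0.9),
                      retention_recurrent(q, k, v, 0.9))
```

The parallel form computes all positions at once, which suits training, while the recurrent form carries a fixed-size state from token to token, which suits step-by-step decoding. The chunkwise paradigm combines the two by applying the parallel form within a chunk and the recurrent update across chunks.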
