
mit-han-lab/lite-transformer

A lightweight transformer architecture with Long-Short Range Attention for efficient NLP tasks.

611 stars · Python · Language Models · ML Frameworks
Velocity · 7d: +0.3 ★ / day
Trend: steady

This repository implements Lite Transformer, an architecture from MIT’s Han Lab described in a paper published at ICLR 2020. The architecture introduces Long-Short Range Attention to improve efficiency over standard transformers. The implementation builds on fairseq and includes custom CUDA-accelerated convolution layers (lightconv and dynamicconv). It supports standard NLP benchmarks, including the IWSLT14 and WMT translation tasks.
