
gordicaleksa/pytorch-original-transformer

A PyTorch implementation of the original Transformer architecture from the seminal Vaswani et al. paper, structured as a learning resource.

1.1k stars · Jupyter Notebook · ML Frameworks, Learning, Language Models
Velocity (7d): +0.5 ★/day · Trend: steady

This repository provides a clean, well-commented PyTorch implementation of the original Transformer model described in the paper "Attention Is All You Need". Its playground.py script contains educational visualizations of concepts such as positional encodings and attention mechanisms. The repository ships with IWSLT pretrained models and is aimed at developers who want to understand how transformers work under the hood.
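To make the positional-encoding concept mentioned above concrete, here is a minimal sketch of the sinusoidal scheme from the paper, where even dimensions use a sine and odd dimensions a cosine of `pos / 10000^(2i/d_model)`. It is not taken from the repository: the function name `sinusoidal_positional_encoding` and its parameters are hypothetical, and it uses plain Python lists rather than PyTorch tensors to stay dependency-free.

```python
import math


def sinusoidal_positional_encoding(max_len, d_model):
    """Return a max_len x d_model table (list of lists) of sinusoidal encodings.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even so sin/cos dimensions pair up")

    table = []
    for pos in range(max_len):
        row = [0.0] * d_model
        # i walks over the even dimension indices (i = 2k in the paper's notation).
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row[i] = math.sin(angle)      # even dimension: sine
            row[i + 1] = math.cos(angle)  # odd dimension: cosine
        table.append(row)
    return table


if __name__ == "__main__":
    pe = sinusoidal_positional_encoding(max_len=4, d_model=8)
    for pos, row in enumerate(pe):
        print(pos, [round(v, 3) for v in row])
```

Each row is the vector added to the token embedding at that position. Low dimensions oscillate quickly with position and high dimensions slowly, which is the pattern a visualization of these encodings typically shows.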
