lilianweng/transformer-tensorflow
A TensorFlow implementation of the original Transformer model for sequence-to-sequence tasks like machine translation.

This repository provides a complete implementation of the Transformer architecture in TensorFlow, including encoder and decoder layers, multi-head self-attention, and positional encoding. It includes training and evaluation scripts for machine translation on standard benchmarks such as WMT14 and IWSLT15. The project implements the core architectural components (attention, feed-forward layers, and residual connections) together with label smoothing as a training-time regularizer. The same architectural building blocks underlie modern large language models.
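
To make the components named above more concrete, here is a minimal sketch of three of them: sinusoidal positional encoding, scaled dot-product attention (the operation inside each attention head), and label smoothing. It is not taken from the repository. It uses plain Python lists rather than TensorFlow tensors so that it runs without dependencies, and the function names (`positional_encoding`, `scaled_dot_product_attention`, `smooth_labels`) are hypothetical. The label smoothing shown is one common formulation that spreads `epsilon` uniformly over the whole vocabulary; the repository may use a different variant.

```python
import math


def positional_encoding(max_len, d_model):
    """Sinusoidal position table of shape [max_len][d_model].

    Even dimensions hold sin(pos / 10000^(i / d_model)) and the following
    odd dimension holds the cosine of the same angle.
    """
    table = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            table[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                table[pos][i + 1] = math.cos(angle)
    return table


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(q, k, v, causal=False):
    """Compute softmax(Q K^T / sqrt(d_k)) V for a single head.

    q: [n_q][d_k], k: [n_k][d_k], v: [n_k][d_v]; returns [n_q][d_v].
    With causal=True, query i may only attend to key positions j <= i,
    as in the decoder's masked self-attention.
    """
    d_k = len(q[0])
    d_v = len(v[0])
    output = []
    for i, q_row in enumerate(q):
        scores = [
            sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d_k)
            for k_row in k
        ]
        if causal:
            # Masked positions get -inf so their softmax weight is exactly 0.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        output.append(
            [sum(w * v_row[c] for w, v_row in zip(weights, v)) for c in range(d_v)]
        )
    return output


def smooth_labels(target_index, vocab_size, epsilon=0.1):
    """Return (1 - epsilon) * one_hot + epsilon / vocab_size (assumed variant)."""
    dist = [epsilon / vocab_size] * vocab_size
    dist[target_index] += 1.0 - epsilon
    return dist


if __name__ == "__main__":
    pe = positional_encoding(max_len=4, d_model=6)
    x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    attended = scaled_dot_product_attention(x, x, x, causal=True)
    print(len(pe), len(pe[0]), len(attended), len(attended[0]))
    print(smooth_labels(target_index=2, vocab_size=5))
```

In a full model, the positional table is added to the token embeddings before the first layer, multi-head attention runs several such attention operations in parallel on learned projections of the input, and the smoothed distribution replaces the one-hot target in the cross-entropy loss.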