kimiyoung/transformer-xl
PyTorch and TensorFlow implementations of Transformer-XL, a recurrent transformer language model that extends context beyond fixed-length segments.

This repository provides the official implementations of Transformer-XL in both PyTorch and TensorFlow, along with pretrained models. Transformer-XL introduced segment-level recurrence and relative positional encoding, which let the model use context beyond a single fixed-length segment. It achieved state-of-the-art results on multiple language modeling benchmarks, including enwik8, text8, One Billion Word, WikiText-103, and Penn Treebank.
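
The two ideas named above can be illustrated with a small sketch. This is not the repository's code: the function names (`update_memory`, `attention_contexts`, `relative_distances`) and the parameters `seg_len` and `mem_len` are hypothetical, and plain integers stand in for the per-layer hidden states that the real model caches (with gradients stopped) between segments. It only shows which positions each segment can attend to and how far back each one lies.

```python
from typing import List, Sequence, Tuple


def update_memory(memory: Sequence[int], segment: Sequence[int], mem_len: int) -> List[int]:
    """Append the new segment to the cache and keep only the last mem_len items."""
    if mem_len <= 0:
        return []
    return (list(memory) + list(segment))[-mem_len:]


def attention_contexts(
    tokens: Sequence[int], seg_len: int, mem_len: int
) -> List[Tuple[List[int], List[int]]]:
    """Split tokens into segments and pair each one with the memory it may attend to.

    Assumption: integers stand in for hidden states; a real model would cache
    one tensor per layer and would not backpropagate into the cached values.
    """
    memory: List[int] = []
    contexts = []
    for start in range(0, len(tokens), seg_len):
        segment = list(tokens[start:start + seg_len])
        # Keys and values span memory + segment; queries come from the segment only.
        contexts.append((list(memory), segment))
        memory = update_memory(memory, segment, mem_len)
    return contexts


def relative_distances(num_memory: int, query_index: int) -> List[int]:
    """Distances from one query to every key it can see (causal attention).

    The query sits at position num_memory + query_index in the concatenated
    context, so the keys at positions 0..that position are at distances
    (num_memory + query_index) down to 0. Only these offsets, not absolute
    positions, would feed the positional term.
    """
    query_pos = num_memory + query_index
    return [query_pos - key_pos for key_pos in range(query_pos + 1)]


if __name__ == "__main__":
    for memory, segment in attention_contexts(list(range(10)), seg_len=4, mem_len=6):
        print("memory:", memory, "segment:", segment)
```

With `seg_len=4` and `mem_len=6`, the third segment still sees tokens from the first one through the cache. That is the sense in which context extends beyond a fixed-length segment. Because positions are expressed as distances, the cached states can be reused without their positional information clashing with the new segment's.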