PiotrNawrot/nanoT5
A PyTorch training pipeline for pre-training and fine-tuning T5 encoder-decoder models on a single GPU.

Velocity (7d): +0.9 ★/day
Trend: steady
The repository lets researchers pre-train and fine-tune T5-style encoder-decoder language models from scratch on a single A100 GPU in approximately 16 hours, reaching competitive performance on the Super-Natural Instructions benchmark. It optimizes the whole training pipeline, including mixed precision, gradient accumulation, and data loading, and is intended as a user-friendly template for NLP research and applications.
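
As a rough illustration of gradient accumulation, one of the techniques named above, the sketch below shows why it lets a single GPU emulate a larger batch: averaging the gradients of equal-sized micro-batches gives the same result as one pass over the full batch. This is not code from nanoT5. The toy model `y_hat = w * x`, the helper names `mse_grad` and `accumulated_grad`, and the data are all assumptions made for the example, and it uses plain Python rather than PyTorch so it runs without dependencies.

```python
def mse_grad(w, xs, ys):
    """Gradient of the mean squared error w.r.t. w for the toy model y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Accumulate per-micro-batch gradients, each scaled by 1 / number of micro-batches.

    Hypothetical helper: assumes the batch splits into equal-sized micro-batches,
    which is what makes the accumulated value match the full-batch gradient.
    """
    n = len(xs)
    assert n % micro_batch_size == 0, "batch must split evenly into micro-batches"
    steps = n // micro_batch_size
    total = 0.0
    for i in range(steps):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        # Only one micro-batch is processed at a time; its gradient is
        # scaled and added to the running total before the next one.
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


# Toy data (made up for illustration): the true relation is y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

full = mse_grad(w, xs, ys)
acc = accumulated_grad(w, xs, ys, micro_batch_size=2)
```

In a real training loop the same idea usually appears as dividing each micro-batch loss by the number of accumulation steps and calling the optimizer only after the last micro-batch; how nanoT5 wires this up is not shown in this summary.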