huggingface/nanotron

A minimalistic library for pretraining transformer models with 3D-parallelism distributed training.

2.7k stars · Python · ML Frameworks · Language Models
Velocity · 7d: +2.7 ★ / day
Trend: steady

Nanotron provides a simple, flexible API for pretraining transformer models, particularly LLMs, on custom datasets. It uses 3D parallelism (tensor, pipeline, and data parallelism combined) to train large models efficiently across distributed compute resources. The library aims to make large-scale pretraining accessible without sacrificing performance.
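To make the idea of combining tensor, pipeline, and data parallelism concrete, here is a minimal sketch of how a flat set of worker ranks can be arranged into a 3D grid. It is not Nanotron's API: the function names `rank_to_coords` and `parallel_groups`, and the choice to let the tensor-parallel index vary fastest, are assumptions made for illustration only.

```python
def rank_to_coords(rank, dp, pp, tp):
    """Map a flat rank to (dp_idx, pp_idx, tp_idx).

    Hypothetical layout: the tensor-parallel index varies fastest,
    then pipeline, then data. Real libraries may order axes differently.
    """
    world_size = dp * pp * tp
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} outside world of size {world_size}")
    tp_idx = rank % tp
    pp_idx = (rank // tp) % pp
    dp_idx = rank // (tp * pp)
    return dp_idx, pp_idx, tp_idx


def parallel_groups(dp, pp, tp):
    """List the ranks that communicate along each parallelism axis."""

    def rank_of(d, p, t):
        # Inverse of rank_to_coords for the same layout.
        return (d * pp + p) * tp + t

    return {
        # Ranks that split a single layer's tensors between them.
        "tp": [[rank_of(d, p, t) for t in range(tp)]
               for d in range(dp) for p in range(pp)],
        # Ranks that hold consecutive pipeline stages of one model replica.
        "pp": [[rank_of(d, p, t) for p in range(pp)]
               for d in range(dp) for t in range(tp)],
        # Ranks that hold the same model shard and average gradients.
        "dp": [[rank_of(d, p, t) for d in range(dp)]
               for p in range(pp) for t in range(tp)],
    }


if __name__ == "__main__":
    # 8 workers arranged as 2 data x 2 pipeline x 2 tensor.
    for axis, groups in parallel_groups(dp=2, pp=2, tp=2).items():
        print(axis, groups)
```

Every rank belongs to exactly one group per axis, so the product of the three degrees must equal the total number of workers.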
