
facebookresearch/lingua

A minimal, fast PyTorch library for training and running inference on large language models, designed for research experimentation.

Velocity (7d): +7.9 ★/day · Trend: steady

Meta Lingua is a lean research codebase for LLM development that supports end-to-end training, inference, and evaluation of language models. Its PyTorch components are easy to modify, so researchers can experiment with new architectures, losses, and data pipelines. The library also includes tools for downloading and preparing data from sources such as the FineWeb and DCLM datasets, and for setting up tokenizers.
