huggingface/trl
A Hugging Face library for post-training foundation models using Supervised Fine-Tuning, GRPO, and Direct Preference Optimization.

Velocity (7d): +8.2 ★/day · Trend: steady
TRL is a library built on top of the Transformers ecosystem for post-training foundation models with techniques such as Supervised Fine-Tuning (SFT), Group Relative Policy Optimization (GRPO), and Direct Preference Optimization (DPO). It provides trainers for these fine-tuning methods and supports scaling across different hardware setups and model architectures.
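To make two of the named techniques concrete, here is a minimal pure-Python sketch of the DPO loss for a single preference pair and of the group-relative advantage that GRPO is built around. It is not TRL's implementation and does not call TRL's API. The function names `dpo_loss` and `group_relative_advantages`, the default `beta=0.1`, and the `eps` stabilizer are illustrative assumptions.

```python
import math
from statistics import mean, pstdev


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities.

    Hypothetical helper for illustration; beta=0.1 is an assumed default.
    """
    # How much more the policy prefers the chosen response over the rejected
    # one, measured relative to the frozen reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == softplus(-x), computed in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against its own sampling group (GRPO-style).

    Hypothetical helper; eps is an assumed stabilizer for zero-variance groups.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # The policy favors the chosen answer more than the reference does,
    # so the loss falls below log(2), its value at zero margin.
    print(dpo_loss(-10.0, -12.0, -11.0, -11.0))
    # Four completions sampled for one prompt, scored 1 (good) or 0 (bad).
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

In DPO the reference model's log-probabilities anchor the update, so the policy is rewarded only for preferring the chosen response more strongly than the reference already does. In GRPO the group mean acts as the baseline, which is why no separate value model is needed.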