huggingface/alignment-handbook
A collection of production-ready training recipes for fine-tuning and aligning language models with human and AI preferences.

Velocity (7d): +5.5 ★/day · Trend: steady
The Alignment Handbook provides robust, community-facing training recipes that cover the full post-training pipeline for language models. It includes recipes for supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and direct preference optimization (DPO), along with guidance on dataset selection, training hyperparameters, and evaluation metrics for measuring helpfulness and safety.
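To make the DPO step mentioned above more concrete, here is a minimal sketch of the standard per-example DPO loss, written in plain Python. It is not taken from the handbook's code: the function name `dpo_loss`, its argument names, and the `beta=0.1` default are assumptions chosen for illustration. The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    All names and the beta default are illustrative assumptions,
    not the handbook's API.
    """
    # How much more the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = chosen_logratio - rejected_logratio

    # -log(sigmoid(z)) equals softplus(-z); compute it in a numerically stable way.
    x = -beta * margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# When the policy matches the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)

# When the policy favors the chosen response more than the reference does, the loss drops.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

In practice the loss would be averaged over a batch and computed on tensors with automatic differentiation; the scalar version here only illustrates how the preference margin drives the objective.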