
natolambert/rlhf-book

An open-source textbook on Reinforcement Learning from Human Feedback covering post-training language model techniques.

2k stars · Python · Learning
Velocity · 7d: +2.6 ★ / day
Trend: steady

This repository contains the source material for a textbook on RLHF, documenting techniques used to align and improve language models after pre-training. The book covers rejection sampling, preference modeling, reward modeling, and character training. It serves as an educational reference for practitioners working at the frontier of open language model development.
