allenai/RL4LMs

A reinforcement learning library for fine-tuning language models to optimize human preference reward functions.

2.4k stars · Python · Language Models · ML Frameworks
Velocity · 7d: +1.7 ★ / day
Trend: steady

RL4LMs provides modular building blocks for training language models with reinforcement learning: on-policy algorithms (PPO, A2C, TRPO, NLPO), reward functions, and 20+ NLG metrics. It supports causal LMs (GPT-2/3) and seq2seq LMs (T5, BART) on NLP tasks such as summarization, translation, dialogue generation, and question answering. The library has been benchmarked in 2000+ experiments on the GRUE benchmark.