
OpenLMLab/MOSS-RLHF

Research framework for training and aligning large language models using Reinforcement Learning from Human Feedback (RLHF) with PPO.

1.4k stars · Python · Language Models · LLMOps · Eval
Velocity (7d): +1.3 ★ / day · Trend: steady

MOSS-RLHF is a research project on RLHF techniques for aligning large language models. Part I covers a Proximal Policy Optimization (PPO) implementation for LLM fine-tuning; Part II addresses reward modeling. The project provides code for training reward models and has released annotated datasets, including a cleaned hh-rlhf dataset. It won the best paper award at the NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following.
