CarperAI/trlx

A distributed training framework for fine-tuning large language models using Reinforcement Learning via Human Feedback (RLHF).

4.7k stars · Python · Language Models · ML Frameworks
Velocity · 7d: +3.5 ★ / day
Trend: steady

trlX is a framework built from the ground up for fine-tuning large language models with reinforcement learning. It can train against either a provided reward function or a reward-labeled dataset, and it supports distributed training across multiple devices. The work was published at EMNLP 2023.
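The two training modes mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's reference usage: it assumes the `trlx.train` entry point accepts `reward_fn`, `samples`, and `rewards` keyword arguments as shown in the project README, and the toy reward, the sample data, and the helper names `train_with_reward_fn` and `train_with_labeled_data` are hypothetical. The trlx import is deferred so the reward logic can be checked without the library installed.

```python
from typing import List


def reward_fn(samples: List[str], **kwargs) -> List[float]:
    """Toy reward (hypothetical): count occurrences of 'cats' in each sample."""
    return [float(sample.count("cats")) for sample in samples]


# Reward-labeled dataset (hypothetical data): parallel lists of text and scores.
labeled_samples = ["dolphins", "geese"]
labeled_rewards = [1.0, 100.0]


def train_with_reward_fn(model_name: str = "gpt2"):
    """Mode 1: score generated samples with a reward function."""
    import trlx  # deferred import; assumes trlx is installed

    # Assumed signature, based on the project README.
    return trlx.train(model_name, reward_fn=reward_fn)


def train_with_labeled_data(model_name: str = "gpt2"):
    """Mode 2: learn from samples that already carry reward labels."""
    import trlx  # deferred import; assumes trlx is installed

    # Assumed signature, based on the project README.
    return trlx.train(model_name, samples=labeled_samples, rewards=labeled_rewards)


if __name__ == "__main__":
    # Only exercises the reward function; no model is loaded here.
    print(reward_fn(["cats and cats", "dogs"]))  # [2.0, 0.0]
```

Calling either helper downloads the named model and starts training, so in practice they would be run on a machine with suitable accelerators.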
