argilla-io/distilabel

A framework for generating synthetic training datasets and collecting AI feedback through scalable RLHF and RLAIF pipelines.

3.2k stars · Python · Data Tooling · LLMOps · Eval
Velocity (7d): +3.4 ★/day
Trend: steady

Distilabel gives engineers pipelines for synthesizing AI training data and collecting model feedback, using methods drawn from verified research papers. It supports RLHF (Reinforcement Learning from Human Feedback) and RLAIF (Reinforcement Learning from AI Feedback) workflows for training and fine-tuning language models. The framework emphasizes speed, reliability, and scalability in dataset generation.