datadreamer-dev/DataDreamer
A Python library for generating synthetic datasets to train and align language models.

Velocity (7d): +1.0 ★ / day · Trend: steady
DataDreamer is a Python library for creating synthetic training data and aligning language models. It provides tools for prompting models, generating synthetic datasets, and training or fine-tuning models on those datasets. It integrates with PyTorch and Hugging Face Transformers and supports instruction-tuning and model-alignment workflows.
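The prompt-then-collect pattern behind synthetic instruction data can be sketched in plain Python. This is a minimal illustration only and does not use DataDreamer's API: `stub_model`, `build_synthetic_dataset`, `to_jsonl`, and the prompt template are all hypothetical names chosen for this example, and the stub stands in for a real language model call.

```python
import json
from typing import Callable, Dict, List


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns a canned response."""
    return f"Response to: {prompt}"


def build_synthetic_dataset(
    topics: List[str],
    generate: Callable[[str], str],
    template: str = "Explain {topic} in one sentence.",
) -> List[Dict[str, str]]:
    """Prompt the model once per topic and collect instruction/response pairs."""
    rows = []
    for topic in topics:
        instruction = template.format(topic=topic)
        rows.append({"instruction": instruction, "response": generate(instruction)})
    return rows


def to_jsonl(rows: List[Dict[str, str]]) -> str:
    """Serialize rows as JSON Lines, a common on-disk format for fine-tuning data."""
    return "\n".join(json.dumps(row) for row in rows)


# Assumed example topics; swap stub_model for a real model client in practice.
dataset = build_synthetic_dataset(["tokenization", "fine-tuning"], stub_model)
jsonl_text = to_jsonl(dataset)
```

In a real workflow the resulting instruction/response rows would be passed to a fine-tuning step; consult the library's own documentation for its actual step and trainer classes.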