
datadreamer-dev/DataDreamer

A Python library for generating synthetic datasets to train and align language models.

Velocity (7d): +1.0 ★/day · Trend: steady

DataDreamer is a Python library for creating synthetic training data and aligning language models. It provides tools for prompting, generating synthetic datasets, and training or fine-tuning models on those datasets. The library integrates with PyTorch and Hugging Face Transformers and supports instruction-tuning and model-alignment workflows.
