bespokelabsai/curator

Python library for bulk inference and scalable synthetic data curation for LLM post-training.

Velocity (7d): +2.9 ★ / day · Trend: steady

The repository provides a framework for generating and curating synthetic datasets for language model post-training pipelines. It supports bulk inference workflows for scalable data extraction and structured dataset generation, including tools for instruction tuning and integration with fine-tuning workflows such as LoRA-based training.
