
BatsResearch/bonito

An open-source model that converts unannotated text into synthetic instruction-tuning datasets, enabling zero-shot task adaptation of LLMs.

825 stars · Python · Language Models · Data Tooling

Bonito is an open-source model for conditional task generation: given unannotated text, it produces task-specific training datasets for instruction tuning. It is built on the Hugging Face transformers and vLLM libraries and uses LLM inference to synthesize training data from raw text. The generated datasets can then be used to fine-tune other language models for zero-shot task adaptation, without manual annotation.
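To make the data flow concrete, here is a minimal sketch of conditional task generation: each raw passage is combined with a task type into a prompt, a generator produces text, and that text is split into an instruction/response pair. This is not Bonito's actual API. The prompt layout, the `<|sep|>` delimiter, and the names `build_prompt`, `parse_generation`, `synthesize`, and `stub_generate` are all assumptions made for illustration, and `stub_generate` stands in for real model inference so the example runs without a GPU.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

# Hypothetical delimiter between instruction and response; not taken from the source.
SEPARATOR = "<|sep|>"


@dataclass
class TaskExample:
    instruction: str
    response: str


def build_prompt(context: str, task_type: str) -> str:
    """Condition the generator on a task type and a raw passage (assumed layout)."""
    return f"Task type: {task_type}\nContext: {context}\nTask:"


def parse_generation(text: str) -> Optional[TaskExample]:
    """Split one generation into an instruction/response pair; None if malformed."""
    instruction, found, response = text.partition(SEPARATOR)
    if not found or not instruction.strip() or not response.strip():
        return None
    return TaskExample(instruction.strip(), response.strip())


def synthesize(
    passages: Iterable[str],
    task_type: str,
    generate: Callable[[str], str],
) -> List[TaskExample]:
    """Turn unannotated passages into training examples, skipping malformed outputs."""
    examples = []
    for passage in passages:
        parsed = parse_generation(generate(build_prompt(passage, task_type)))
        if parsed is not None:
            examples.append(parsed)
    return examples


def stub_generate(prompt: str) -> str:
    """Stand-in for LLM inference: echoes the passage as a trivial task."""
    context = prompt.split("Context: ", 1)[1].rsplit("\nTask:", 1)[0]
    return f"Summarize the passage: {context}{SEPARATOR}{context}"


if __name__ == "__main__":
    for example in synthesize(["Bonito is a fish."], "summarization", stub_generate):
        print(example)
```

In a real pipeline, `generate` would wrap an inference call to the task-generation model, and the resulting examples would be written out as a dataset for fine-tuning a separate model.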
