wasiahmad/Awesome-LLM-Synthetic-Data
A curated reading list of papers, tools, and blogs on using LLMs to generate synthetic data for training and improving language models.

Velocity (7d): +2.3 ★/day · Trend: steady
This repository compiles research and resources on synthetic data generation using large language models. It covers methods and applications across mathematical reasoning, code generation, alignment, reward modeling, long-context understanding, multi-modal tasks, and agent systems. Organized as an awesome-style list, it serves as a reference for researchers and practitioners working on LLM training data creation.