
wasiahmad/Awesome-LLM-Synthetic-Data

A curated reading list of papers, tools, and blogs on using LLMs to generate synthetic data for training and improving language models.

Velocity (7d): +2.3 ★ / day · Trend: steady

This repository compiles research and resources on synthetic data generation using large language models. It covers methods and applications across mathematical reasoning, code generation, alignment, reward modeling, long-context understanding, multi-modal tasks, and agent systems. Organized as an awesome-style list, it serves as a reference for researchers and practitioners working on LLM training data creation.
