
OpenBioLink/ThoughtSource

Open-source framework providing chain-of-thought reasoning datasets and dataloaders for training and evaluating large language models.

1k stars · Jupyter Notebook · Language Models · Data Tooling · Learning
Velocity · 7d: +0.7 ★ / day
Trend: steady

ThoughtSource is a centralized open resource focused on chain-of-thought reasoning data for LLMs. It provides standardized dataloaders that convert multiple datasets (such as commonsense QA and biomedical reasoning), together with their reasoning chains, into the Hugging Face datasets format. The repository also includes workflow tools and a tutorial notebook for working with reasoning data, in support of research into trustworthy and robust LLM reasoning.
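
To make the idea of a standardized dataloader concrete, here is a minimal sketch of mapping two differently shaped raw records onto one shared layout with an explicit reasoning chain. Everything in it is an assumption for illustration: the `ReasoningExample` fields, the two converter functions, the raw record layouts, and the sample records are hypothetical and are not ThoughtSource's actual schema or API.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class ReasoningExample:
    """Hypothetical unified record; NOT ThoughtSource's actual schema."""
    source: str       # name of the originating dataset
    question: str
    cot: List[str]    # reasoning chain, one step per list entry
    answer: str


def from_commonsense(raw: Dict) -> ReasoningExample:
    # Assumed raw layout: the explanation is one string of sentences.
    steps = [s.strip().rstrip(".") for s in raw["explanation"].split(". ") if s.strip()]
    return ReasoningExample("commonsense_qa", raw["q"], steps, raw["label"])


def from_biomedical(raw: Dict) -> ReasoningExample:
    # Assumed raw layout: the rationale is already a list of steps.
    return ReasoningExample(
        "biomedical_qa", raw["question"], list(raw["rationale"]), raw["final_answer"]
    )


# Made-up sample records, used only to exercise the converters.
raw_commonsense = {
    "q": "Where would you store ice cream?",
    "explanation": "Ice cream melts at room temperature. A freezer keeps food below zero.",
    "label": "freezer",
}
raw_biomedical = {
    "question": "Which vitamin deficiency causes scurvy?",
    "rationale": [
        "Scurvy results from impaired collagen synthesis",
        "Collagen synthesis requires vitamin C",
    ],
    "final_answer": "vitamin C",
}

# After conversion, both records expose the same keys.
records = [
    asdict(from_commonsense(raw_commonsense)),
    asdict(from_biomedical(raw_biomedical)),
]

for record in records:
    print(record["source"], "|", " -> ".join(record["cot"]), "|", record["answer"])
```

Because every record ends up as a plain dictionary with identical keys, a list like `records` could be handed to `datasets.Dataset.from_list` if the Hugging Face `datasets` library is installed. The repository's own dataloaders may structure their records differently.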
