OpenBioLink/ThoughtSource
Open-source framework providing chain-of-thought reasoning datasets and dataloaders for training and evaluating large language models.

ThoughtSource is a central, open resource for chain-of-thought reasoning data for large language models (LLMs). Its standardized dataloaders convert multiple datasets, covering domains such as commonsense question answering and biomedical reasoning, into a shared Hugging Face Datasets format that includes the reasoning chains. The repository also includes workflow tools and a tutorial notebook for working with reasoning data, in support of research into trustworthy and robust LLM reasoning.
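
To make the idea of a standardized dataloader concrete, here is a minimal sketch of mapping two differently shaped source records onto one shared layout with an explicit reasoning-chain field. It uses only the Python standard library and does not call ThoughtSource itself. Every name in it (the `normalize_*` functions, the raw field names such as `stem` and `rationale_steps`, and the unified keys `id`, `question`, `choices`, `cot`, `answer`) is a hypothetical chosen for illustration and should not be read as the project's actual schema or API.

```python
from typing import Any, Dict, List

# Keys of the hypothetical shared layout (illustrative only, not ThoughtSource's schema).
UNIFIED_KEYS = {"id", "question", "choices", "cot", "answer"}


def normalize_commonsense(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map a hypothetical multiple-choice record with a free-text explanation."""
    # Split the free-text explanation into individual reasoning steps.
    steps = [s.strip() for s in raw["explanation"].split(".") if s.strip()]
    return {
        "id": raw["qid"],
        "question": raw["stem"],
        "choices": list(raw["options"]),
        "cot": steps,
        "answer": raw["options"][raw["label"]],
    }


def normalize_biomedical(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map a hypothetical yes/no/maybe record that already stores steps as a list."""
    return {
        "id": raw["doc_id"],
        "question": raw["query"],
        "choices": ["yes", "no", "maybe"],
        "cot": list(raw["rationale_steps"]),
        "answer": raw["decision"],
    }


def validate(record: Dict[str, Any]) -> Dict[str, Any]:
    """Check that a record has exactly the shared keys and a consistent answer."""
    if set(record) != UNIFIED_KEYS:
        raise ValueError(f"unexpected keys: {sorted(record)}")
    if record["answer"] not in record["choices"]:
        raise ValueError("answer must be one of the choices")
    return record


# Made-up sample inputs in two different source shapes.
raw_commonsense = {
    "qid": "cs-1",
    "stem": "What happens to ice when it is heated?",
    "options": ["it melts", "it freezes"],
    "label": 0,
    "explanation": "Ice is frozen water. Frozen water melts when heated.",
}
raw_biomedical = {
    "doc_id": "bio-1",
    "query": "Does drug X reduce symptom Y?",
    "rationale_steps": ["The trial compared X with placebo", "Symptom Y decreased with X"],
    "decision": "yes",
}

records: List[Dict[str, Any]] = [
    validate(normalize_commonsense(raw_commonsense)),
    validate(normalize_biomedical(raw_biomedical)),
]
```

Once every source is reduced to the same keys, downstream code can iterate over `records` without caring where each item came from. A list of uniform dictionaries like this is also the kind of input that the Hugging Face `datasets` library can turn into a dataset object, although the sketch deliberately stops short of that step.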