
openai/prm800k

A dataset of 800,000 step-level correctness labels on LLM-generated solutions to MATH problems for training and evaluating process reward models.

2.1k stars · Python · Data Tooling · LLMOps · Eval
Velocity (7d): +1.9 ★/day · Trend: steady

PRM800K is a process supervision dataset of 800,000 step-level correctness labels on LLM-generated solutions to problems from the MATH dataset. Human annotators produced the labels over multiple collection phases, with quality-control checks on label reliability. The dataset was introduced in the paper ‘Let's Verify Step by Step’ and supports research on improving LLM mathematical reasoning with process reward models.
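To make "step-level labels" concrete, here is a minimal Python sketch of how such records might be read. It assumes each record is one JSON object per line, with steps under `label.steps`, candidate completions under `completions`, and a `rating` of -1, 0, or +1 per completion. These field names and the sample record are assumptions for illustration; check them against the repository's README before relying on them.

```python
import json
from collections import Counter

# Hypothetical record shaped after the layout assumed above; not real data.
SAMPLE_LINE = json.dumps({
    "question": {"problem": "What is 1 + 1?", "ground_truth_answer": "2"},
    "label": {
        "steps": [
            {"completions": [{"text": "Add the two ones.", "rating": 1}]},
            {"completions": [{"text": "So the answer is 3.", "rating": -1}]},
        ]
    },
})


def step_ratings(record):
    """Yield (step_index, rating) for every rated completion in one record."""
    for index, step in enumerate(record["label"]["steps"]):
        # Some steps may carry no model completions; treat that as empty.
        for completion in step.get("completions") or []:
            rating = completion.get("rating")
            if rating is not None:
                yield index, rating


def count_ratings(lines):
    """Tally ratings over an iterable of JSON Lines strings."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        counts.update(rating for _, rating in step_ratings(json.loads(line)))
    return counts


def first_error_step(record):
    """Return the index of the first negatively rated step, or None."""
    for index, rating in step_ratings(record):
        if rating < 0:
            return index
    return None


if __name__ == "__main__":
    print(count_ratings([SAMPLE_LINE]))
    print(first_error_step(json.loads(SAMPLE_LINE)))
```

For a real file, `count_ratings` could be passed an open file handle in place of the sample list, since it only iterates over lines.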
