code-kern-ai/refinery

An open-source platform for scaling, assessing, and maintaining natural language training data with annotation and active learning capabilities.

1.5k stars · Python · Data Tooling · ML Frameworks
Velocity (7d): +1.0 ★ / day · Trend: steady

Refinery is an open-source labeling tool for natural language processing that helps data scientists scale and maintain training datasets. It supports text annotation and classification workflows and integrates with spaCy and transformers. The platform treats training data as a software artifact, with built-in quality assessment and active learning features.
