ianarawjo/ChainForge
A visual programming environment for evaluating and comparing LLM prompts and responses across multiple models.

Velocity (7d): +2.6 ★/day · Trend: steady
ChainForge is a data flow prompt engineering environment for analyzing and evaluating LLM responses. It enables rapid comparison of prompts, models, and response quality across multiple LLMs simultaneously. Built on ReactFlow and Flask, it allows users to set up evaluation metrics, visualize results across prompt variations and model settings, and includes AI-assisted features for generating test data and starter evaluation code.
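The comparison workflow described above can be pictured as a grid: every filled-in prompt variation is sent to every model, and each response is scored by an evaluator. The following is a minimal sketch of that idea in plain Python, not ChainForge's own code or API. The names `fill_template`, `run_grid`, `model_upper`, and `model_echo` are hypothetical, and the two "models" are local stand-in functions rather than real LLM calls.

```python
from itertools import product


def fill_template(template, variables):
    """Return one (binding, prompt) pair per combination of variable values."""
    names = list(variables)
    filled = []
    for values in product(*(variables[name] for name in names)):
        binding = dict(zip(names, values))
        filled.append((binding, template.format(**binding)))
    return filled


# Hypothetical stand-ins for LLM calls; a real setup would query model APIs here.
def model_upper(prompt):
    return prompt.upper()


def model_echo(prompt):
    return prompt


def evaluate(response_text):
    """Toy evaluation metric (assumption for illustration): response length."""
    return len(response_text)


def run_grid(template, variables, models, evaluator):
    """Send every prompt variation to every model and score each response."""
    rows = []
    for binding, prompt in fill_template(template, variables):
        for model_name, call_model in models.items():
            text = call_model(prompt)
            rows.append({
                "model": model_name,
                "vars": binding,
                "prompt": prompt,
                "response": text,
                "score": evaluator(text),
            })
    return rows


rows = run_grid(
    "Translate {word} into {language}.",
    {"word": ["cat", "dog"], "language": ["French", "German"]},
    {"upper": model_upper, "echo": model_echo},
    evaluate,
)

for row in rows:
    print(row["model"], row["vars"], row["score"])
```

Each row keeps the model name and the variable binding next to the score, so results can be grouped by prompt variation or by model, which is the kind of side-by-side view the description refers to.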