
agentscope-ai/OpenJudge

A unified evaluation framework for assessing AI agent quality and converting grading results into RLHF reward signals.

643 stars · Python · LLMOps · EvalAgents
Velocity (7d): +1.9 ★/day · Trend: steady

OpenJudge is an open-source framework for evaluating AI applications, particularly AI agents and chatbots. It provides ready-to-use graders and can generate scenario-specific rubrics for assessing application quality. Grading results can be converted into reward signals for fine-tuning and optimizing applications through RLHF workflows. The framework aims to streamline the evaluation loop, from data collection through weakness analysis to rapid iteration.
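
To make the idea of turning grading results into a reward signal concrete, here is a minimal sketch in plain Python. It does not use OpenJudge's API: `RubricScore` and `to_reward` are hypothetical names invented for illustration, and the weighted-average normalization is just one plausible way to collapse rubric scores into a scalar. The framework itself may do this differently.

```python
from dataclasses import dataclass


@dataclass
class RubricScore:
    """One grader verdict on a single rubric criterion (hypothetical type)."""
    criterion: str
    score: float      # raw score on the rubric's own scale
    max_score: float  # highest score the rubric allows for this criterion
    weight: float = 1.0


def to_reward(scores: list[RubricScore]) -> float:
    """Collapse rubric scores into one scalar reward in [0, 1].

    Each score is normalized by its maximum, then combined as a
    weighted average. This is an illustrative choice, not OpenJudge's method.
    """
    total_weight = sum(s.weight for s in scores)
    if total_weight <= 0:
        raise ValueError("at least one criterion with positive weight is required")
    weighted = sum(s.weight * (s.score / s.max_score) for s in scores)
    return weighted / total_weight


# Example: two criteria graded on a 5-point scale, the first weighted double.
scores = [
    RubricScore("task_completed", score=4, max_score=5, weight=2.0),
    RubricScore("tool_use_correct", score=3, max_score=5, weight=1.0),
]
reward = to_reward(scores)
print(f"reward = {reward:.3f}")
```

A scalar in a fixed range like this is the kind of value a fine-tuning loop can consume directly; keeping the per-criterion scores alongside it preserves the detail needed for weakness analysis.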
