langchain-ai/openevals

A library of pre-built LLM-based evaluators for scoring the quality of outputs from LLM applications.

1.1k stars · Python · LLMOps · Eval
Velocity (7d): +2.2 ★/day · Trend: steady

OpenEvals provides pre-built evaluator prompts and LLM-as-judge pipelines for evaluating LLM application outputs. It uses an LLM judge (defaulting to GPT-4) to score outputs automatically against defined criteria such as conciseness, correctness, and helpfulness. Available in both Python and TypeScript, it serves as a starting point for developers writing evals for their production LLM systems.
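
To make the LLM-as-judge idea concrete, here is a minimal sketch of the general pattern: format a grading prompt, send it to a judge, and parse the verdict into a score. It is not the OpenEvals API. The names `CONCISENESS_TEMPLATE`, `make_judge_evaluator`, and `stub_judge` are hypothetical, and the judge is a deterministic stand-in so the example runs without a model or API key.

```python
from typing import Callable

# Hypothetical prompt template; not one of the prompts shipped with OpenEvals.
CONCISENESS_TEMPLATE = (
    "You are grading an answer for conciseness.\n"
    "Question: {inputs}\n"
    "Answer: {outputs}\n"
    "Reply with PASS if the answer is concise, otherwise FAIL."
)


def make_judge_evaluator(template: str, judge: Callable[[str], str], key: str):
    """Build an evaluator that formats a prompt, calls the judge, and parses a verdict."""

    def evaluate(*, inputs: str, outputs: str) -> dict:
        prompt = template.format(inputs=inputs, outputs=outputs)
        reply = judge(prompt).strip()
        # The verdict is taken from the start of the judge's reply.
        score = reply.upper().startswith("PASS")
        return {"key": key, "score": score, "comment": reply}

    return evaluate


def stub_judge(prompt: str) -> str:
    """Stand-in for a real model call (assumption): passes answers of at most 20 words."""
    answer = prompt.split("Answer: ", 1)[1].split("\n", 1)[0]
    return "PASS" if len(answer.split()) <= 20 else "FAIL: answer is too long"


conciseness = make_judge_evaluator(CONCISENESS_TEMPLATE, stub_judge, "conciseness")
result = conciseness(inputs="What is the capital of France?", outputs="Paris.")
print(result)
```

In a real setup, `stub_judge` would be replaced by a call to an actual model, and the parsing step would need to handle replies that do not follow the requested PASS/FAIL format.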