
confident-ai/deepeval

DeepEval is an open-source evaluation framework for testing and measuring the quality of LLM outputs.

16k stars · Python · LLMOps · Eval
Velocity (7d): +15 ★/day · Trend: steady

DeepEval provides a Python-based framework for evaluating large language model outputs against configurable metrics. It offers built-in evaluation criteria and a test-runner workflow for systematically assessing LLM performance. The framework integrates with various LLM backends and provides tooling for developers to benchmark and validate AI applications.
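To make the metric-plus-test-runner idea concrete, here is a minimal, library-free sketch of the general pattern: each test case is scored by a metric and passes if the score meets a threshold. This is not DeepEval's API. The names `EvalCase`, `Metric`, `keyword_coverage`, and `run_suite` are hypothetical and exist only for illustration, and the keyword-overlap score stands in for whatever scoring a real metric would perform.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    """Hypothetical container for one LLM interaction to evaluate."""
    prompt: str
    actual_output: str
    expected_keywords: List[str]


@dataclass
class Metric:
    """Hypothetical metric: a scoring function plus a pass threshold."""
    name: str
    score_fn: Callable[[EvalCase], float]
    threshold: float


def keyword_coverage(case: EvalCase) -> float:
    """Fraction of expected keywords found in the output (case-insensitive)."""
    if not case.expected_keywords:
        return 1.0
    text = case.actual_output.lower()
    hits = sum(1 for kw in case.expected_keywords if kw.lower() in text)
    return hits / len(case.expected_keywords)


def run_suite(cases: List[EvalCase], metrics: List[Metric]) -> List[Dict]:
    """Score every case against every metric and record pass/fail."""
    results = []
    for case in cases:
        for metric in metrics:
            score = metric.score_fn(case)
            results.append({
                "prompt": case.prompt,
                "metric": metric.name,
                "score": score,
                "passed": score >= metric.threshold,
            })
    return results


if __name__ == "__main__":
    cases = [
        EvalCase("What is the capital of France?",
                 "The capital of France is Paris.", ["Paris"]),
        EvalCase("Name two primary colors.",
                 "Red is one.", ["red", "blue"]),
    ]
    metrics = [Metric("keyword_coverage", keyword_coverage, threshold=0.75)]
    for row in run_suite(cases, metrics):
        status = "PASS" if row["passed"] else "FAIL"
        print(f"{status} {row['metric']}={row['score']:.2f} :: {row['prompt']}")
```

Keeping the scoring function separate from the threshold means the same runner can be reused with different metrics, which is the kind of configurability the description above refers to.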
