vibrantlabsai/ragas
A Python toolkit for evaluating and optimizing LLM applications through objective metrics and intelligent test generation.

Velocity (7d): +13 ★/day · Trend: steady
Ragas provides metrics and tooling for evaluating the outputs of LLM applications. It offers objective evaluation metrics, intelligent test case generation for LLM pipelines, and data-driven insights into application quality. The toolkit also supports benchmarking and monitoring of LLM-based systems, which helps developers identify issues in RAG pipelines, chat applications, and other LLM-powered software.
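
To make "objective evaluation metric" concrete, here is a minimal, self-contained sketch of scoring RAG outputs with a reproducible number. It does not use the Ragas API and does not reproduce any Ragas metric. The function names (`tokenize`, `context_overlap`) and the sample records are hypothetical, and simple token overlap is only a crude stand-in for groundedness.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def context_overlap(answer: str, contexts: list[str]) -> float:
    """Fraction of answer tokens that also appear in the retrieved contexts.

    Hypothetical toy metric (NOT a Ragas metric): 1.0 means every answer
    token occurs somewhere in the contexts; 0.0 means none do, or the
    answer is empty.
    """
    answer_tokens = tokenize(answer)
    if not answer_tokens:
        return 0.0
    context_tokens: set[str] = set()
    for passage in contexts:
        context_tokens |= tokenize(passage)
    return len(answer_tokens & context_tokens) / len(answer_tokens)


# Hypothetical evaluation records: one answer plus its retrieved contexts.
samples = [
    {
        "answer": "Paris is the capital",
        "contexts": ["The capital of France is Paris."],
    },
    {
        "answer": "Berlin is the capital",
        "contexts": ["The capital of France is Paris."],
    },
]

# Score every record; low scores flag answers to inspect by hand.
scores = [context_overlap(s["answer"], s["contexts"]) for s in samples]
for sample, score in zip(samples, scores):
    print(f"{score:.2f}  {sample['answer']}")
```

The second record scores lower because "Berlin" never appears in the retrieved context. That is the kind of signal a per-sample metric gives when looking for problems in a RAG pipeline. A real evaluation would use the metrics that Ragas itself provides rather than this word-overlap proxy.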