← all repositories

amazon-science/RAGChecker

A fine-grained evaluation framework for diagnosing RAG system performance using claim-level entailment metrics.

1.1k stars Python LLMOps · EvalRAG · Search
RAGChecker
Velocity · 7d
+1.5
★ / day
Trend
steady
star history

RAGChecker is an automatic evaluation framework designed to assess and diagnose Retrieval-Augmented Generation systems. It provides comprehensive metrics including overall system assessment, diagnostic retriever metrics, and diagnostic generator metrics. The framework utilizes claim-level entailment operations for fine-grained evaluation and includes a benchmark dataset with thousands of questions across multiple domains.

heatdrop uses Google Analytics to see which pages get read — nothing else. Your call. How we handle data.