
vectara/hallucination-leaderboard

A public leaderboard that evaluates and ranks LLMs based on hallucination rates when summarizing documents.

3.3k stars · Python · LLMOps · Eval · Language Models
Velocity (7d): +3.4 ★ / day · Trend: steady

The repository provides a standardized evaluation framework for measuring how often different LLMs introduce hallucinations during document summarization. It uses Vectara’s Hallucination Evaluation Model (HHEM) to score models on metrics including hallucination rate, factual consistency rate, and answer rate. Results are regularly updated and presented as an interactive leaderboard on Hugging Face with tabular data showing performance across dozens of LLMs.
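As a rough illustration of how the three metrics named above relate to one another, the sketch below aggregates per-summary consistency scores into leaderboard-style percentages. It is not the repository's code. The function `summarize_scores`, the `LeaderboardMetrics` container, the 0.5 cut-off, and the use of `None` to mark a declined summary are all assumptions made for this example; it also assumes that a higher score means the summary is more consistent with its source, and that the hallucination rate is the complement of the factual consistency rate.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class LeaderboardMetrics:
    """Hypothetical container for the three percentages shown per model."""
    hallucination_rate: float
    factual_consistency_rate: float
    answer_rate: float


def summarize_scores(
    scores: Sequence[Optional[float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the repository
) -> LeaderboardMetrics:
    """Aggregate per-summary consistency scores into percentages.

    Each entry is a score in [0, 1] for one document, or None when the
    model declined to produce a summary (an assumption of this sketch).
    """
    if not scores:
        raise ValueError("scores must not be empty")

    # Only summaries that were actually produced can be judged.
    answered = [s for s in scores if s is not None]
    if not answered:
        raise ValueError("no summaries were produced")

    # A summary counts as factually consistent when it meets the threshold.
    consistent = sum(1 for s in answered if s >= threshold)
    factual_consistency_rate = 100.0 * consistent / len(answered)

    return LeaderboardMetrics(
        hallucination_rate=100.0 - factual_consistency_rate,
        factual_consistency_rate=factual_consistency_rate,
        answer_rate=100.0 * len(answered) / len(scores),
    )


# Made-up scores for five documents; the third one was not summarized.
metrics = summarize_scores([0.9, 0.2, None, 0.7, 0.6])
print(metrics)
```

In this sketch the hallucination and consistency rates are computed over answered documents only, so a model that declines often is reflected in its answer rate rather than in its hallucination rate.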
