GAIR-NLP/factool

A tool-augmented framework for detecting factual errors and hallucinations in text generated by large language models.

932 stars · Python · LLMOps · EvalAgents
Velocity (7d): +0.9 ★ / day · Trend: steady

FacTool evaluates the factuality of LLM outputs across four task types: knowledge-based QA, code generation, mathematical reasoning, and scientific literature review. It calls external tools, such as search queries and code execution, to verify claims and citations against ground truth. The framework also incorporates Halu-J, an open-source model for hallucination detection, and provides benchmarks, including ChineseFactEval for evaluating LLMs in Chinese.
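To make the tool-augmented idea concrete, here is a minimal sketch of routing claims to a per-task checker. It is not FacTool's API: `Verdict`, `check_math`, `check_knowledge`, `KNOWN_FACTS`, `TOOLS`, and `verify` are all hypothetical names, and the in-memory fact set stands in for a real search tool. It covers only two of the four task types and assumes math claims are written as `<expression> = <value>`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Verdict:
    claim: str
    factual: bool
    evidence: str


# Hypothetical stand-in for an external search tool: a tiny reference set.
KNOWN_FACTS = {"paris is the capital of france"}


def check_knowledge(claim: str) -> Verdict:
    # Normalize the claim, then look it up in the reference set.
    key = claim.strip().rstrip(".").lower()
    found = key in KNOWN_FACTS
    evidence = "matched reference" if found else "no supporting evidence"
    return Verdict(claim, found, evidence)


def check_math(claim: str) -> Verdict:
    # Assumed claim format: "<expression> = <value>", e.g. "3 * 4 = 12".
    expr, _, stated = claim.partition("=")
    # Evaluate plain arithmetic with builtins disabled (toy input only).
    actual = eval(expr.strip(), {"__builtins__": {}}, {})
    return Verdict(claim, actual == float(stated), f"computed {actual}")


# Hypothetical task-type labels mapped to their checker ("tool").
TOOLS: Dict[str, Callable[[str], Verdict]] = {
    "kbqa": check_knowledge,
    "math": check_math,
}


def verify(task: str, claims: List[str]) -> List[Verdict]:
    # Dispatch every claim to the checker registered for this task type.
    checker = TOOLS[task]
    return [checker(claim) for claim in claims]


if __name__ == "__main__":
    for verdict in verify("math", ["3 * 4 = 12", "2 + 2 = 5"]):
        print(verdict)
```

The dispatch table keeps each task type's verification logic separate, so a code-execution or citation checker could be registered in the same way without touching `verify`.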
