GAIR-NLP/factool
A tool-augmented framework for detecting factual errors and hallucinations in text generated by large language models.

Velocity (7d): +0.9 ★/day · Trend: steady
FacTool evaluates the factuality of LLM outputs across four task types: knowledge-based QA, code generation, mathematical reasoning, and scientific literature review. Rather than relying on the model's own judgment, it calls external tools to check each output against evidence: search queries for factual claims, code execution for generated code, and citation lookups for references. The repository also incorporates Halu-J, an open-source model for hallucination detection, and provides benchmarks, including ChineseFactEval for evaluating LLMs in Chinese.
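To make the idea of tool-based verification concrete, here is a minimal sketch of how a checker might dispatch a claim to a tool by task type. It is not FacTool's API: every name here (`verify_math_claim`, `verify_code_claim`, `VERIFIERS`, `check`) is hypothetical and chosen for illustration. It covers only the two task types that can be checked offline (math and code), and it assumes claims arrive already extracted.

```python
import ast
import operator

# Hypothetical sketch; none of these names come from the FacTool codebase.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def _eval(node):
    """Evaluate a small arithmetic AST without calling eval()."""
    if isinstance(node, ast.Expression):
        return _eval(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported expression")


def verify_math_claim(expression, claimed_value, tol=1e-9):
    """Recompute the expression with a tool and compare to the claimed value."""
    actual = _eval(ast.parse(expression, mode="eval"))
    return abs(actual - claimed_value) <= tol


def verify_code_claim(source, func_name, cases):
    """Run the generated function against (args, expected) test cases."""
    namespace = {}
    exec(source, namespace)  # assumption: trusted input; sandbox this in practice
    func = namespace[func_name]
    return all(func(*args) == expected for args, expected in cases)


# Map each task type to the tool that can check it.
VERIFIERS = {"math": verify_math_claim, "code": verify_code_claim}


def check(task_type, *args):
    """Dispatch a claim to its verifier and report a factuality verdict."""
    return {"task": task_type, "factual": VERIFIERS[task_type](*args)}


if __name__ == "__main__":
    print(check("math", "12 * (3 + 4)", 84))
    print(check("math", "2 ** 10", 1000))
    print(check("code", "def add(a, b):\n    return a + b", "add", [((1, 2), 3)]))
```

The dispatch table keeps each tool independent, so a search-backed or citation-backed verifier could be registered under its own key without touching the others.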