Is long-form-factuality open source?

Yes — google-deepmind/long-form-factuality is an open-source project tracked on heatdrop.

What language is long-form-factuality written in?

google-deepmind/long-form-factuality is primarily written in Python.

How popular is long-form-factuality?

google-deepmind/long-form-factuality has 692 stars on GitHub.

Where can I find long-form-factuality?

google-deepmind/long-form-factuality is on GitHub at https://github.com/google-deepmind/long-form-factuality.

google-deepmind/long-form-factuality

LLMs Love to Ramble. This Grades the Facts.

Because a three-paragraph answer is useless if half the paragraphs are fiction.

★692 stars Python LLMOps · Eval Data Tooling

What it does

LongFact is a collection of 2,280 prompts engineered to make models generate lengthy, fact-dense responses. SAFE (Search-Augmented Factuality Evaluator) automatically verifies the factual claims in those responses. The repo also ships F1@K, a metric that stretches standard F1 scoring into long-form territory by baking human-preferred answer length into recall.
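To make the recall idea concrete, here is a minimal sketch of how an F1@K-style score can be computed. It assumes the definition from the accompanying paper: precision is the share of checked facts that are supported, and recall is the number of supported facts divided by K (the preferred number of supported facts), capped at 1. The function name `f1_at_k` and its arguments are illustrative and are not taken from the repo's code.

```python
def f1_at_k(supported: int, not_supported: int, k: int) -> float:
    """Sketch of an F1@K-style score (illustrative, not the repo's API).

    supported:     number of facts in the response judged as supported
    not_supported: number of facts judged as not supported
    k:             preferred number of supported facts in a response
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if supported < 0 or not_supported < 0:
        raise ValueError("fact counts must be non-negative")

    # A response with no supported facts scores zero by definition.
    if supported == 0:
        return 0.0

    # Precision: share of the checked facts that are supported.
    precision = supported / (supported + not_supported)
    # Recall: supported facts relative to the preferred count K, capped at 1.
    recall = min(supported / k, 1.0)

    # Harmonic mean of precision and recall, as in standard F1.
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # 48 supported and 12 unsupported facts, with K = 64.
    print(round(f1_at_k(48, 12, 64), 4))
```

Because recall is capped at 1, a response gains nothing from piling on supported facts beyond K, while every unsupported fact still lowers precision.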

The interesting bit

Most benchmarks test short answers; this one tackles the messier problem of verifying accuracy across sprawling multi-sentence output without hiring a room of human fact-checkers. The main pipeline is pre-wired to benchmark OpenAI and Anthropic models.

Key highlights

2,280 prompts in longfact/ designed to elicit long, information-heavy answers
SAFE uses search-augmented verification to judge factual accuracy automatically
F1@K adapts precision/recall metrics for long-form text using human-preferred length
Includes a full experimentation pipeline for OpenAI and Anthropic model APIs
Most files have corresponding unit tests

Caveats

The benchmarking pipeline requires your own API keys for OpenAI and Anthropic, dropped into common/shared_config.py
Top-level documentation repeatedly points to subdirectory READMEs for real detail, so expect to click around
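Since the pipeline needs both keys before it can run, a small pre-flight check can save a failed run. The sketch below is a hypothetical convenience and not part of the repo: it assumes you keep the keys in the environment variables `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` and copy them into common/shared_config.py yourself. The helper name `missing_api_keys` is made up for illustration.

```python
import os

# Assumption: the keys live in these environment variables before being
# copied into common/shared_config.py. The repo itself may not read them.
REQUIRED_ENV_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_api_keys(env=None):
    """Return the names of required keys that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("Both API keys are set.")
```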

Verdict

Worth a look if you need to evaluate long-form generation and want to replace manual fact-checking with an automated pipeline. Not for you if your models only emit short, single-sentence answers.
