thunlp/FewRel

A benchmark that makes relation extraction starve for data on purpose

FewRel forces NLP models to learn entity relationships from a handful of examples, then tests whether they actually generalize.

What it does

FewRel is a dataset and benchmark for few-shot relation extraction: given five or ten candidate relation types (N-way) and typically just one or five labeled examples of each (K-shot), your model must decide which relation holds between two entities in a query sentence. It ships with 100+ relations, tens of thousands of annotated instances, and baseline implementations including Prototypical Networks and a BERT-based PAIR model.
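To make the N-way K-shot setup concrete, here is a minimal sketch of how one evaluation episode could be drawn. It assumes the data is a plain dictionary mapping each relation name to a list of instances; the function name `sample_episode`, the `n_query` parameter, and the toy data are illustrative and are not taken from the FewRel code.

```python
import random


def sample_episode(data, n_way, k_shot, n_query=1, rng=None):
    """Draw one N-way K-shot episode from {relation: [instances]} (hypothetical format)."""
    rng = rng or random.Random()
    # Pick N relations for this episode; sorting keeps sampling reproducible.
    relations = rng.sample(sorted(data), n_way)
    support, query = [], []
    for label, rel in enumerate(relations):
        # Draw K support and n_query query instances without overlap.
        picked = rng.sample(data[rel], k_shot + n_query)
        support.extend((inst, label) for inst in picked[:k_shot])
        query.extend((inst, label) for inst in picked[k_shot:])
    return relations, support, query


# Toy data: 8 relations with 10 placeholder sentences each.
toy = {f"rel_{i}": [f"sentence {i}-{j}" for j in range(10)] for i in range(8)}
relations, support, query = sample_episode(
    toy, n_way=5, k_shot=1, n_query=2, rng=random.Random(0)
)
print(len(relations), len(support), len(query))  # 5 5 10
```

The model sees only the support pairs for the sampled relations and is scored on how many query instances it assigns to the right one.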

The interesting bit

The project deliberately withholds the test set: you submit your model to their leaderboard for evaluation, which keeps the benchmark honest. FewRel 2.0 then piles on two extra headaches: domain adaptation (Wikipedia → PubMed) and “none-of-the-above” (NOTA) detection, where some query instances match none of the provided relations.
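As a rough illustration of what none-of-the-above detection asks of a model, the sketch below pairs the prototype idea behind Prototypical Networks (classify a query by its nearest class mean) with a simple distance cutoff. The cutoff is an assumption made for this example, not the method the FewRel baselines use, and the two-dimensional vectors stand in for encoder outputs.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors (the class prototype)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(query_vec, support, nota_threshold=None):
    """Return the nearest relation, or None when every prototype is too far away.

    `nota_threshold` is a hypothetical knob for this sketch only.
    """
    prototypes = {label: mean_vector(vecs) for label, vecs in support.items()}
    best_label, best_dist = min(
        ((label, euclidean(query_vec, proto)) for label, proto in prototypes.items()),
        key=lambda pair: pair[1],
    )
    if nota_threshold is not None and best_dist > nota_threshold:
        return None  # none of the provided relations fits
    return best_label


# Toy 2-way 2-shot support set with made-up embeddings.
support = {
    "founded_by": [[1.0, 0.0], [0.8, 0.2]],
    "capital_of": [[0.0, 1.0], [0.2, 0.8]],
}
print(classify([0.9, 0.1], support))                      # founded_by
print(classify([5.0, 5.0], support, nota_threshold=1.0))  # None
```

Without the cutoff, the second query would still be forced into one of the two relations, which is exactly the failure mode the NOTA track is meant to expose.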

Key highlights

  • Two benchmark tracks: FewRel 1.0 (standard few-shot) and FewRel 2.0 (adds domain adaptation + NOTA detection)
  • Baseline models include Proto-CNN and BERT-PAIR, with reproduction commands and reported numbers in the README
  • Supports configurable N-way K-shot settings, multiple encoders (CNN, BERT), and adversarial training for domain shift
  • Hidden test sets with public leaderboards; validation sets available for local tuning
  • Pre-trained embeddings and BERT checkpoint downloadable via provided script

Caveats

  • Test data is intentionally withheld, so you cannot run full offline evaluation without submitting to their website
  • The repo contains data but not pre-trained files (GloVe, BERT); you need to run download_pretrain.sh
  • --fp16 requires NVIDIA Apex, which is an extra dependency not handled by standard pip

Verdict

Researchers working on few-shot learning for NLP or relation extraction specifically should grab this. If you need an off-the-shelf relation extractor with abundant training data, or you refuse to submit to external leaderboards, look elsewhere.
