gkiril/oie-resources

A bibliography that actually stays current

A living survey of Open Information Extraction research, papers, code, and datasets spanning 2006 to 2022.

Velocity (7d): +0.2 ★/day · Trend: steady

What it does

This repository is a curated index of Open Information Extraction (OIE) resources: research papers, code, datasets, slides, talks, and PhD theses. OIE systems extract subject-relation-object triples from unstructured text without predefined relation schemas; the README walks through a concrete example with “AMD, which is based in U.S., is a technology company.”
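To make the triple format concrete, here is a minimal Python sketch of what an OIE system's output for that sentence might look like. The `Triple` type, the `relations` helper, and the two extractions are hand-written illustrations, not output from any particular system and not quoted from the README.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One open-domain extraction: (subject; relation; object)."""
    subject: str
    relation: str
    obj: str


sentence = "AMD, which is based in U.S., is a technology company."

# Hand-written, illustrative extractions (an assumption, not real system output).
# The relation phrases come from the sentence itself rather than a fixed schema.
triples = [
    Triple("AMD", "is based in", "U.S."),
    Triple("AMD", "is", "a technology company"),
]


def relations(extractions):
    """Return the distinct relation phrases found, sorted alphabetically."""
    return sorted({t.relation for t in extractions})


for t in triples:
    print(f"({t.subject}; {t.relation}; {t.obj})")
```

The point of the sketch is that the relation strings are free text lifted from the input, which is what separates OIE from extraction against a predefined relation schema.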

The interesting bit

The maintainer organizes papers both chronologically (year by year from 2006) and by functional category: surveys, evaluation methods, downstream applications (QA, slot filling, knowledge base construction), multilingual systems, canonicalization, and more. It’s a map of how a subfield evolved, not just a dump of links.

Key highlights

  • Covers 17 years of OIE research with direct PDF links and author attribution
  • Groups downstream uses into 14 application areas including event extraction, video grounding, and open link prediction
  • Tracks non-English OIE systems across 8 languages (German, Portuguese, Spanish, Chinese, Persian, Italian, Indonesian, Greek)
  • Includes code and data links where available (ClausIE, OLLIE, ReVerb resources, etc.)
  • Maintains sections for slides, talks, demos, and derived corpora

Caveats

  • The chronological list stops at 2022; anything newer is absent unless it appears in a category section
  • Some paper links point to ACL Anthology or author homepages, but others rely on CiteSeerX or Semantic Scholar mirrors that may rot
  • No explicit curation criteria are stated, so it is unclear how papers are selected or what “highly related” means in practice

Verdict

Worth bookmarking if you’re doing literature review in information extraction or need to trace the lineage from TextRunner to modern neural approaches. Skip it if you want executable code or a tutorial; this is a reading list with signposts, not a framework.
