HKUST-KnowComp/R-Net

A 2017 SQuAD model, rebuilt with Transformer-era attention

This is a faithful TensorFlow re-implementation of Microsoft's R-Net that quietly swaps in scaled multiplicative attention and CuDNN GRUs to make a memory-hungry architecture trainable.

577 stars · Python · Language Models

What it does

R-Net is a neural reading comprehension model: given a passage and a question, it locates the answer span within the passage. This repo targets the SQuAD dataset specifically and reproduces the original paper’s exact-match (EM) and F1 scores almost exactly (71.07/79.51 here vs. 71.1/79.5 in the paper).
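
For readers unfamiliar with the two metrics, here is a minimal sketch of how EM and token-level F1 compare a predicted span with a reference answer. It is a simplification and not the official SQuAD evaluation script, which also strips punctuation and articles and takes the best score over several reference answers; the function names are illustrative.

```python
from collections import Counter


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the lowercased token sequences are identical, else 0.0."""
    return float(prediction.lower().split() == truth.lower().split())


def token_f1(prediction: str, truth: str) -> float:
    """Harmonic mean of token precision and recall (simplified normalization)."""
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

A prediction that contains the answer plus one extra word scores 0 on EM but still earns partial F1, which is why F1 runs higher than EM in the figures above.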

The interesting bit

The author didn’t just port the paper. Additive attention, as originally specified, is a memory hog: it builds a hidden-sized vector for every passage-question token pair before reducing it to a score. This implementation substitutes the scaled multiplicative attention from “Attention Is All You Need” and layers in variational dropout and residual-style concatenation to keep stacked RNNs from degrading. CuDNN GRU plus bucketing cuts training time on a TITAN X from 2.56 s/it to 0.28 s/it (roughly 9× faster), though bucketing costs about 0.3% F1, which feels like a fair trade.
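
To make the memory argument concrete, here is a minimal pure-Python sketch of scaled multiplicative (dot-product) attention, plus a rough count of the intermediate values each scoring scheme materializes. It is not the repository's code: the function names are hypothetical, the accounting ignores batching and projections, and the sizes in the usage note are illustrative assumptions.

```python
import math


def scaled_dot_attention(queries, keys, values):
    """queries: P x d, keys: Q x d, values: Q x dv, all as lists of lists."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # One scalar score per (query, key) pair, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        # Numerically stable softmax over the keys.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Attention output: weighted sum of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def score_intermediate_floats(p_len, q_len, hidden, additive):
    """Rough count of floats held while scoring every token pair.

    Additive scoring forms tanh(W_p p_i + W_q q_j), a hidden-sized vector
    per pair; multiplicative scoring needs one dot product per pair.
    """
    return p_len * q_len * (hidden if additive else 1)
```

With an assumed 400-token passage, 50-token question, and hidden size of 75, `score_intermediate_floats` gives 1,500,000 values per example for additive scoring against 20,000 for multiplicative, a factor equal to the hidden size.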

Key highlights

  • Reproduces original R-Net scores on SQuAD v1.1 to within 0.03 EM
  • Scaled multiplicative attention replaces the memory-intensive additive variant
  • CuDNN GRU + bucketing yields a ~9× speedup (2.56 → 0.28 s/it) over the unoptimized configuration on the same TITAN X
  • Learning rate halves automatically when dev loss plateaus
  • Optional extensions: pretrained GloVe character embeddings, FastText vectors (reported +1% F1)
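
The learning-rate rule in the list above can be sketched as follows. This is a minimal illustration, not the repository's training loop: the class name, the `patience` parameter, and the idea of counting consecutive non-improving evaluations are assumptions, since the source only says the rate halves when dev loss plateaus.

```python
class HalveOnPlateau:
    """Halve the learning rate after `patience` evaluations without improvement."""

    def __init__(self, lr, patience=3):
        self.lr = lr
        self.patience = patience      # assumed value; not taken from the repo
        self.best = float("inf")
        self.bad_evals = 0

    def step(self, dev_loss):
        if dev_loss < self.best:
            # New best dev loss: reset the plateau counter.
            self.best = dev_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
            if self.bad_evals >= self.patience:
                self.lr /= 2.0
                self.bad_evals = 0
        return self.lr
```

Calling `step` once per dev evaluation returns the rate to use for the next stretch of training.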

Caveats

  • Requires TensorFlow ≥1.5.0 and spaCy ≥2.0.0; the README warns of “a lot of known problems caused by using different software versions”
  • No documentation beyond the README; you’ll need to read config.py to toggle optional extensions
  • Bucketing is faster but costs about 0.3% F1
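
For context on the bucketing trade-off, here is a minimal sketch of the general technique: group sequences of similar length so each batch is padded only to its own longest member. The function name, the boundaries, and the zero padding id are illustrative assumptions, not the repository's input pipeline. The source reports the accuracy cost but does not explain it.

```python
def bucket_batches(examples, boundaries, batch_size):
    """Group token-id lists into length buckets, then cut padded batches per bucket."""
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for ex in examples:
        # Bucket index = number of boundaries the length exceeds.
        idx = sum(len(ex) > b for b in boundaries)
        buckets[idx].append(ex)

    batches = []
    for bucket in buckets:
        for i in range(0, len(bucket), batch_size):
            batch = bucket[i:i + batch_size]
            width = max(len(ex) for ex in batch)
            # Pad with 0 only up to the longest sequence in this batch.
            batches.append([ex + [0] * (width - len(ex)) for ex in batch])
    return batches
```

Four sequences of lengths 2, 7, 3, and 5 with boundaries `[3, 6]` and a batch size of 2 pad to 18 tokens in total, against 28 if everything were padded to the longest sequence.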

Verdict

Worth a look if you’re studying the 2017 vintage of the SQuAD leaderboard or need a working, optimized R-Net baseline. Skip it if you want a maintained, modern transformer-based reader: this is a period piece with some thoughtful engineering retrofits.
