A 2016 neural QA model that still demands 12GB of GPU RAM
The original BiDAF implementation for SQuAD, frozen in TensorFlow r0.11 amber.

What it does
BiDAF (Bi-Directional Attention Flow) answers questions about a passage of text by modeling context at multiple granularities and using bidirectional attention to build query-aware representations. This is the original AllenAI implementation that scored 68.0% EM / 77.3% F1 on the SQuAD test set with a single model, or 73.3% / 81.1% ensembled.
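For readers unfamiliar with the two metrics, here is a minimal sketch of how EM (exact match) and F1 are typically computed for a single SQuAD-style prediction. It follows the general recipe of the SQuAD v1.1 evaluator (normalize both strings, then compare tokens), but it is not the script bundled with the repository, and the function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1(prediction, truth):
    """Token-level F1 between the normalized prediction and reference answer."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

Leaderboard numbers such as those above are averages over the dataset, with each question scored against its best-matching reference answer.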
The interesting bit
The attention mechanism flows in both directions, query-to-context and context-to-query, without early summarization: every context position keeps its own attended vector instead of the passage being collapsed into a single fixed-size vector. The model therefore retains more fine-grained alignment information than one-directional approaches. At ~2.5M parameters it was relatively compact for its era, though it still needed a Titan X-class GPU with 12GB of VRAM to train.
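A rough NumPy sketch of the two attention directions may make this concrete. It follows the formulation in the BiDAF paper as best I understand it (a trilinear similarity matrix, softmax over query words for context-to-query, softmax over the row-wise maximum for query-to-context). It is not the repository's TensorFlow code, and the names `H`, `U`, `w`, and `bidirectional_attention` are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bidirectional_attention(H, U, w):
    """H: (T, d) context vectors, U: (J, d) query vectors, w: (3d,) weights.

    Returns G with shape (T, 4d) plus both attention distributions.
    """
    T, d = H.shape
    w_h, w_u, w_hu = w[:d], w[d:2 * d], w[2 * d:]

    # S[t, j] = w . [h_t; u_j; h_t * u_j], computed term by term so the
    # (T, J, 3d) concatenation is never materialized.
    S = (H @ w_h)[:, None] + (U @ w_u)[None, :] + (H * w_hu) @ U.T  # (T, J)

    # Context-to-query: each context word attends over the query words.
    a = softmax(S, axis=1)            # (T, J), rows sum to 1
    U_att = a @ U                     # (T, d)

    # Query-to-context: which context words matter most to any query word.
    b = softmax(S.max(axis=1), axis=0)  # (T,), sums to 1
    h_att = b @ H                       # (d,)
    H_att = np.tile(h_att, (T, 1))      # (T, d), same vector at every position

    # One row per context position is kept: no summarization into one vector.
    G = np.concatenate([H, U_att, H * U_att, H * H_att], axis=1)  # (T, 4d)
    return G, a, b
```

Note that the output still has one row per context word, which is the "no early summarization" property: later layers see the original context vector alongside its query-aware counterparts.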
Key highlights
- Original research code, not a maintained framework; TensorFlow r0.11 only, with a dev branch for v1.2
- Multi-GPU training supported via gradient averaging across GPUs with smaller per-device batches
- Pre-trained weights available via CodaLab worksheet for leaderboard reproduction
- Includes unofficial (harsher) evaluation during training; official SQuAD evaluator bundled separately
- Demo code lives on a separate branch, not in main
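The multi-GPU point above can be illustrated with a small simulation. The sketch below is an assumption-laden stand-in, not the repository's TensorFlow code: it uses NumPy, a toy linear model, and hypothetical function names to show why averaging gradients from equal-sized per-device shards reproduces the full-batch gradient.

```python
import numpy as np


def mse_grad(w, X, y):
    """Gradient of 0.5 * mean((X @ w - y) ** 2) with respect to w."""
    residual = X @ w - y
    return X.T @ residual / len(y)


def averaged_gradient(w, X, y, num_devices):
    """Split the batch into equal shards, one per simulated device,
    compute a gradient on each shard, and average the results."""
    if len(y) % num_devices != 0:
        raise ValueError("batch size must be divisible by num_devices")
    X_shards = np.split(X, num_devices)
    y_shards = np.split(y, num_devices)
    grads = [mse_grad(w, Xs, ys) for Xs, ys in zip(X_shards, y_shards)]
    return np.mean(grads, axis=0)
```

Because each shard's loss is a mean over the same number of examples, the averaged gradient matches the single-device gradient on the whole batch, while each device only has to hold its smaller shard in memory.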
Caveats
- Requires obsolete TensorFlow r0.11; likely painful to run on modern stacks
- Python 2 explicitly unsupported; Python 3.5.2 is the only verified version
- 12GB GPU RAM minimum for default batch size; smaller GPUs need workarounds
Verdict
Worth studying if you're tracing the evolution of neural QA architectures or reproducing classic SQuAD baselines. Skip it if you need a modern, maintained reading comprehension model: this is a research artifact from 2016, not a starting point for new work.