A seq2seq time capsule from 2018
A straightforward TensorFlow 1.x implementation of encoder-decoder summarization that shows its work.

What it does
Trains a neural summarizer on news articles and their headlines using the classic encoder-decoder playbook: bidirectional LSTM encoder, LSTM decoder with Bahdanau attention, beam search at inference time. It reads article text and compresses it to a headline-length summary. The code is explicit about every layer choice and links directly to the TensorFlow 1.x contrib.seq2seq APIs it calls.
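For readers who have not met Bahdanau attention before, here is a minimal pure-Python sketch of the additive scoring it uses: each encoder state gets a score v · tanh(W_dec·s + W_enc·h), the scores are softmaxed, and the context vector is the weighted sum of encoder states. This is not the repo's code and not the contrib.seq2seq implementation; the function name, the weight names (W_dec, W_enc, v), and the toy numbers are all illustrative assumptions.

```python
import math


def additive_attention(decoder_state, encoder_states, W_dec, W_enc, v):
    """Additive (Bahdanau-style) attention over a list of encoder states.

    score_i = v . tanh(W_dec @ decoder_state + W_enc @ encoder_states[i])
    Returns (weights, context). All names here are hypothetical.
    """
    def matvec(matrix, vector):
        return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

    dec_proj = matvec(W_dec, decoder_state)
    scores = []
    for h in encoder_states:
        enc_proj = matvec(W_enc, h)
        hidden = [math.tanh(a + b) for a, b in zip(dec_proj, enc_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Softmax over the scores, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[j] for w, h in zip(weights, encoder_states))
        for j in range(dim)
    ]
    return weights, context


# Toy inputs (made up): three encoder states of size 2, identity projections.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, -0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, context = additive_attention(
    decoder_state, encoder_states, identity, identity, [1.0, 1.0]
)
```

In a trained model the projections and v are learned, and the computation runs on batched tensors rather than Python lists.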
The interesting bit
This is essentially a readable reference implementation, not a framework. It wires together stack_bidirectional_dynamic_rnn, BasicDecoder, and BeamSearchDecoder the long way, which is useful if you’re trying to understand how those pieces fit together as they stood before Keras and TF 2.x swallowed them. The sample outputs show the model mostly succeeds at extracting the who-what-when from newswire text, occasionally hallucinating a synonym or dropping a number.
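To make the beam search part concrete, here is a small standalone sketch of the idea: keep the top-k partial hypotheses by cumulative log-probability at each step instead of committing to the single best token. It is a simplified illustration, not BeamSearchDecoder itself, which operates on batched tensors; there is no length normalization here, and the function name, the toy probability table, and the token strings are all invented for the example.

```python
import math


def beam_search(step_log_probs, start_token, end_token, beam_width=3, max_len=10):
    """Return (score, sequence) for the best hypothesis found.

    step_log_probs(prefix) must return a dict mapping each candidate next
    token to its log-probability given the prefix (a hypothetical interface).
    """
    beams = [(0.0, [start_token])]
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for score, seq in beams:
            for token, logp in step_log_probs(seq).items():
                candidates.append((score + logp, seq + [token]))
        if not candidates:
            break
        # Keep only the beam_width highest-scoring expansions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            if seq[-1] == end_token:
                finished.append((score, seq))
            else:
                beams.append((score, seq))
        if not beams:
            break
    # Hypotheses that never emitted end_token still compete at the end.
    finished.extend(beams)
    return max(finished, key=lambda c: c[0])


# Made-up next-token distribution, keyed by the last token of the prefix.
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "w": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
    "w": {"</s>": 1.0},
}


def toy_model(prefix):
    return {tok: math.log(p) for tok, p in TABLE[prefix[-1]].items()}


best_score, best_seq = beam_search(toy_model, "<s>", "</s>", beam_width=2)
greedy_score, greedy_seq = beam_search(toy_model, "<s>", "</s>", beam_width=1)
```

With a beam width of 1 this degenerates to greedy decoding, which takes the locally likelier first token "a" and ends up with a lower-probability sequence than the width-2 search that keeps "b" alive.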
Key highlights
- GloVe initialization for word embeddings (optional, via --glove flag)
- Configurable depth, width, beam width, and dropout from CLI
- --toy flag for quick smoke tests on 5K samples
- Pre-trained checkpoint available for immediate inference
- ROUGE evaluation via external files2rouge tool
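As a rough picture of what a ROUGE score measures, the sketch below computes a simplified ROUGE-1 (unigram overlap) precision, recall, and F1 for one candidate/reference pair. It is a stand-in for intuition only, not what files2rouge runs: it assumes naive whitespace tokenization and lowercasing, with no stemming or stopword handling, and the function name and example sentences are made up.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Simplified ROUGE-1 for a single pair of strings (illustrative only)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped unigram overlap: each word counts at most as often as it
    # appears in both texts.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented example: 4 of 5 candidate words appear among the 6 reference words.
scores = rouge_1(
    "police arrest suspect in london",
    "police arrest two suspects in london",
)
```

Numbers from a sketch like this will not match the external tool's output, so any scores you want to compare with published results should come from files2rouge itself.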
Caveats
- Locked to TensorFlow ≥1.8.0 and Python 3; the contrib APIs this depends on are deprecated or removed in modern TensorFlow
- No mention of training time, hardware requirements, or achieved ROUGE scores in the README
- Dataset comes from an external Harvard NLP repo that may or may not still host the files
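If you do try to run it, a small guard like the following could fail fast on an incompatible install. It is a hypothetical helper, not part of the repo. The ≥1.8 lower bound comes from the stated requirement; the 1.x-only upper bound is an assumption based on tf.contrib having been removed in TensorFlow 2.0.

```python
def is_supported_tf_version(version):
    """True for TensorFlow 1.x version strings at or above 1.8.

    Hypothetical helper; assumes a "major.minor[.patch]" version string.
    """
    parts = version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    return major == 1 and minor >= 8


# Typical use (not executed here, since it needs TensorFlow installed):
#   import tensorflow as tf
#   if not is_supported_tf_version(tf.__version__):
#       raise RuntimeError("needs TensorFlow 1.x, version 1.8 or later")
```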
Verdict
Worth a look if you’re teaching or learning seq2seq mechanics and need code that maps cleanly to textbook diagrams. Skip it if you want a maintained, production-ready summarizer—this is a fossil from the tf.contrib era, and modern abstractive summarization has moved on to transformers.