An early ancestor of today's retrieval chatbots
A 2016 TensorFlow implementation that turned two LSTMs into a conversation-matching engine for Ubuntu support chats.

What it does
This repo trains a neural network to pick the best response from a fixed set of candidates rather than generating text from scratch. Feed it a conversation context and a pool of possible replies; it scores each reply against the context and ranks them by that score. The training data is the Ubuntu Dialogue Corpus, so the domain is strictly tech support chatter.
The interesting bit
The “dual encoder” architecture is the hook: one LSTM encodes the conversation history into a vector, another encodes the candidate response, and the model learns to score matching pairs higher than mismatched ones. It is the same retrieval-based idea that still powers many production chatbots, just with 2016-era TensorFlow and a much smaller model.
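To make the scoring step concrete, here is a minimal sketch in plain Python, not the repo's TensorFlow code. It uses the bilinear form from the Lowe et al. paper, sigmoid(c^T M r), where c and r are the context and response vectors. Everything else is an assumption for illustration: the encoder is a mean of random word vectors standing in for the LSTMs, M is an untrained identity matrix, and the names `embed`, `encode`, `score`, `rank`, and `DIM` are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the toy word vectors are reproducible

DIM = 8  # hypothetical vector size, chosen only for illustration

# Toy vocabulary: each word gets a random vector on first use.
# A trained model would learn these embeddings instead.
_vocab = {}


def embed(word):
    """Return the (random, untrained) vector for a word."""
    if word not in _vocab:
        _vocab[word] = [random.uniform(-1.0, 1.0) for _ in range(DIM)]
    return _vocab[word]


def encode(text):
    """Stand-in encoder: mean of word vectors (the repo uses an LSTM here)."""
    vectors = [embed(w) for w in text.lower().split()]
    if not vectors:
        return [0.0] * DIM  # empty text maps to the zero vector
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]


# Bilinear matrix M. It is learned in a real model; identity here (assumption).
M = [[1.0 if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]


def score(context, response):
    """Match probability sigmoid(c^T M r) for one context/response pair."""
    c = encode(context)
    r = encode(response)
    m_r = [sum(M[i][j] * r[j] for j in range(DIM)) for i in range(DIM)]
    logit = sum(c[i] * m_r[i] for i in range(DIM))
    return 1.0 / (1.0 + math.exp(-logit))


def rank(context, candidates):
    """Return the candidates sorted from best to worst match."""
    return sorted(candidates, key=lambda cand: score(context, cand), reverse=True)


if __name__ == "__main__":
    context = "my wifi driver stopped working after the upgrade"
    candidates = [
        "try reinstalling the wifi driver package",
        "which release did you upgrade to",
        "i like turtles",
    ]
    for reply in rank(context, candidates):
        print(round(score(context, reply), 3), reply)
```

Because the weights are random and untrained, the ordering this prints is not meaningful; the sketch only shows the data flow. In the real model, training adjusts the encoders and M so that true context/response pairs receive high scores.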
Key highlights
- Implements the Dual LSTM Encoder from the Lowe et al. Ubuntu Dialogue Corpus paper
- Pure retrieval: no text generation, so every reply is a well-formed, human-written message (though limited to what is in the candidate pool)
- Ships with train / test / predict scripts and a companion blog post explaining the mechanics
- Requires TensorFlow >= 0.9, which dates it firmly to the pre-1.0 era
- Data lives on Google Drive, not in the repo itself
Caveats
- The README has a duplicate “Evaluation” header and a typo (“acrhive”), suggesting maintenance stopped long ago
- TensorFlow 0.9 is long obsolete; expect to spend time hunting down compatible dependencies to get it running
Verdict
Worth a look if you are studying how retrieval-based chatbots evolved or need a clean, simple baseline to beat. Skip it if you want something production-ready or modern; the field has moved to transformers and dense passage retrieval.