bshao001/ChatLearner

A chatbot that can do math (because neural nets can't)

ChatLearner bolts rule-based reasoning onto TensorFlow's seq2seq model so it can tell time, solve arithmetic, and read jokes—things pure neural chatbots flunk.

544 stars · Python · Chat Assistants · Language Models
Velocity (7d): +0.2 ★/day · Trend: steady

What it does

ChatLearner trains a conversational agent in TensorFlow 1.4–1.11 using the then-new NMT seq2seq architecture. The hook: it layers hand-coded rules on top so the bot can handle tasks neural models alone botch, like arithmetic, date lookups, and story retrieval. A curated “Papaya” dataset mixes handcrafted persona data with cleaned Cornell movie dialogs and Reddit comments. SOAP and REST APIs wrap the model, with a Java GUI client included.
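To make the hybrid idea concrete, here is a minimal sketch of a rule layer placed in front of a neural model. It is an illustration only, not the repo's code: `rule_reply`, `reply`, and the `neural_model` callable are hypothetical names, and ChatLearner's actual wiring between rules and the seq2seq model may differ.

```python
import operator
import re
from datetime import datetime

# Hypothetical rule layer: none of these names come from the ChatLearner repo.
_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}
_ARITH = re.compile(r"(-?\d+(?:\.\d+)?)\s*([+\-*/])\s*(-?\d+(?:\.\d+)?)")


def rule_reply(question, now=None):
    """Return a rule-based answer, or None if no rule applies."""
    match = _ARITH.search(question)
    if match:
        left, op, right = float(match.group(1)), match.group(2), float(match.group(3))
        if op == "/" and right == 0:
            return "I can't divide by zero."
        result = _OPS[op](left, right)
        # Print whole numbers without a trailing ".0".
        return str(int(result)) if result.is_integer() else str(round(result, 6))
    if "what time" in question.lower():
        return (now or datetime.now()).strftime("%H:%M")
    return None


def reply(question, neural_model, now=None):
    """Try the rules first; fall back to the neural model for everything else."""
    ruled = rule_reply(question, now)
    return ruled if ruled is not None else neural_model(question)
```

Calling `reply("What is 12 * 7?", model)` answers from the arithmetic rule without consulting the model, while small talk such as `reply("How are you?", model)` falls through to whatever `model` returns.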

The interesting bit

The author treats deep learning as a language-modeling layer, not an oracle. The rules aren’t an afterthought; they’re the admission that “no matter how powerful a deep learning model can be, it cannot even answer questions requiring simple arithmetic calculations.” The dataset is also unusually opinionated: the bot is trained to play a polite, philosophical 9-year-old named Papaya.

Key highlights

  • Custom “Papaya” dataset with persona-consistent handcrafted samples plus cleaned Cornell and Reddit data
  • Rule integration for math, time/date, and random content retrieval (stories, jokes)
  • In-graph lowercasing solution for TensorFlow’s tf.data TextLineDataset
  • Both SOAP and REST web service wrappers, with a Java GUI reference implementation
  • Legacy seq2seq branch available for comparison

Caveats

  • Locked to TensorFlow 1.4–1.11; the tf.data API changes in 1.12 break compatibility
  • Requires manual PYTHONPATH setup and careful vocab.txt consistency between training and inference
  • The author warns that vocabulary size, not training sample count, is the real capacity bottleneck
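The vocabulary caveats above can be checked mechanically. The sketch below assumes vocab.txt holds one token per line and that text is lowercased before lookup; both are assumptions for illustration, and all function names are hypothetical rather than taken from the repo.

```python
import hashlib


def load_vocab(path):
    """Read one token per line, preserving order (order determines token ids)."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


def vocab_fingerprint(tokens):
    """Hash the ordered token list; any reorder, addition, or removal changes it."""
    digest = hashlib.sha256()
    for token in tokens:
        digest.update(token.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def oov_rate(sentences, tokens):
    """Fraction of whitespace-separated words that are missing from the vocabulary."""
    known = set(tokens)
    # Assumption: the vocabulary is lowercase, so inputs are lowercased too.
    words = [word for sentence in sentences for word in sentence.lower().split()]
    if not words:
        return 0.0
    return sum(word not in known for word in words) / len(words)
```

Comparing `vocab_fingerprint(load_vocab(...))` for the training-time and inference-time files catches a mismatched vocab.txt before it silently shifts token ids, and `oov_rate` gives a rough sense of how much of a sample corpus the vocabulary fails to cover.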

Verdict

Worth a look if you’re building a chatbot and need a practical example of a hybrid neural/rule architecture, especially for constrained domains. Skip if you need modern TensorFlow or an out-of-the-box production system; this is a research/educational snapshot from 2017-era tooling.
