charlesXu86/Chatbot_CN

A Chinese chatbot ecosystem that grew too big for one repo

Thirteen sub-projects, one umbrella: this is what happens when a task-oriented bot metastasizes into a full-stack NLP platform.

Velocity (7d): +0.5 ★/day · Trend: steady

What it does

Chatbot_CN is a Chinese-language conversational AI stack targeting finance and legal domains, with chitchat as a side gig. It spans the full pipeline: text correction, NER, syntax parsing, NLU, dialogue management, knowledge graphs, retrieval-based fallback, and even voice hardware integration via Raspberry Pi. The author split it into 13 sub-repositories: a RASA-based dialogue engine, a Scrapy crawler, an evaluation suite, third-party connectors for DingTalk and WeChat, the works.

The interesting bit

The project outgrew itself. What started as a single repo became a meta-project where Chatbot_CN itself holds only documentation; the code lives elsewhere. The author is frank that some modules exist “just for completeness” (the end-to-end seq2seq model) and that the recommendation module is still in planning. That honesty is refreshing in a field of overpromising demo-ware.

Key highlights

  • Core dialogue engine built on RASA, with custom extensions (AutoDL, model compression, teacher-student distillation)
  • NLP utilities exposed as RESTful APIs: BERT-based text correction, entity recognition, coreference resolution
  • Retrieval fallback using inverted index + BERT fine-tuning for FAQ and intent-miss cases
  • Skill manager handles task switching mid-dialogue (jumping from Task A to Task B without losing state)
  • Voice module and hardware integration for embedded use
  • Botfront integration for model/intent management and web-based interaction
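
To make the retrieval-fallback highlight concrete, here is a minimal sketch of an inverted index answering FAQ-style queries when no intent is recognized. It is not the project's code. `FaqIndex`, `respond`, the `min_overlap` threshold, and the character-level tokenizer are all hypothetical names and choices made for illustration. Plain token-overlap scoring stands in for the BERT fine-tuning step, which is omitted here.

```python
from collections import defaultdict


def tokenize(text):
    # Assumption: character-level tokens, a crude stand-in for a real
    # Chinese word segmenter.
    return [ch for ch in text if not ch.isspace()]


class FaqIndex:
    """Hypothetical FAQ store backed by an inverted index."""

    def __init__(self, faqs):
        # faqs: iterable of (question, answer) pairs
        self.faqs = list(faqs)
        self.postings = defaultdict(set)  # token -> ids of questions containing it
        for doc_id, (question, _) in enumerate(self.faqs):
            for tok in set(tokenize(question)):
                self.postings[tok].add(doc_id)

    def search(self, query, min_overlap=0.5):
        q_tokens = set(tokenize(query))
        if not q_tokens:
            return None
        # Count how many query tokens each candidate question shares.
        hits = defaultdict(int)
        for tok in q_tokens:
            for doc_id in self.postings.get(tok, ()):
                hits[doc_id] += 1
        if not hits:
            return None
        # Highest overlap wins; ties go to the lower document id.
        best_id, count = max(hits.items(), key=lambda kv: (kv[1], -kv[0]))
        if count / len(q_tokens) < min_overlap:
            return None
        return self.faqs[best_id][1]


def respond(query, nlu_intent, index):
    # If NLU found an intent, hand off to the task flow (represented here
    # by a placeholder string); otherwise fall back to FAQ retrieval.
    if nlu_intent is not None:
        return f"[task:{nlu_intent}]"
    answer = index.search(query)
    return answer if answer is not None else "Sorry, I did not understand."
```

Calling `respond(query, None, index)` simulates an intent miss. A fuller version would presumably re-rank the candidates from the index with a fine-tuned model rather than trusting raw overlap.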

Caveats

  • The main repo has contained no runnable code since January 2020; you must clone and wire up multiple sub-projects
  • The author notes “many details need improvement” and that some users report the full system won’t start
  • Documentation is in Chinese; English speakers are on their own
  • Several listed modules (recommendation, some analytics) are planned or incomplete

Verdict

Worth studying if you’re building a Chinese task-oriented bot and want a reference architecture for how the pieces fit: NLU, KG, retrieval, RASA core, evaluation loops. Not a drop-in solution; expect to fork, patch, and translate. If you need a working chatbot this afternoon, look elsewhere.
