Doragd/Chinese-Chatbot-PyTorch-Implementation

A student chatbot that admits its flaws

Coursework project documenting the painful gap between reading PyTorch tutorials and actually training a Chinese seq2seq model that converges.

917 stars · Python · Chat Assistants · Language Models
Velocity (7d): +0.4 ★/day · Trend: steady

What it does

A Chinese chatbot trained on the qingyun corpus (100k casual dialogue pairs) with a standard seq2seq architecture: bidirectional GRU encoder, unidirectional GRU decoder, and global dot-product attention. It answers in two stages: it first tries to retrieve a reply from a small hand-curated knowledge base (100 Tencent Cloud Q&A pairs), then falls back to neural generation.
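To make "global dot-product attention" concrete, here is a minimal pure-Python sketch of the general technique, not the repository's code. The function name `dot_attention` and the toy vectors are assumptions made for illustration; a real model would do the same arithmetic on batched PyTorch tensors.

```python
import math

def dot_attention(decoder_state, encoder_outputs):
    """Global dot-product attention over all encoder time steps (illustrative)."""
    # Score every encoder output by its dot product with the decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_outputs]
    # Softmax over the scores, shifted by the max for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Context vector: attention-weighted sum of the encoder outputs.
    dim = len(decoder_state)
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_outputs))
               for i in range(dim)]
    return weights, context

# Toy inputs (hypothetical): three encoder steps, hidden size 2.
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]
weights, context = dot_attention(decoder_state, encoder_outputs)
print(weights, context)
```

"Global" here means the weights are computed over every encoder step rather than a local window; the encoder steps most aligned with the current decoder state get the largest share of the context vector.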

The interesting bit

The README’s “pitfall journal” is the real artifact. The author records each humbling discovery: model.to(device) doesn’t move tensors created inside forward, batch size mysteriously affects convergence, and torch.long is a 64-bit integer type, not a “high-precision float.” It’s a rare public record of how much debugging sits between understanding a tutorial and getting code to work.
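Two of these pitfalls can be reproduced in a few lines. The sketch below is not taken from the repository; the class `Demo` and its attribute names are hypothetical. It uses a dtype conversion in place of a device move so it runs on a CPU-only machine, relying on the fact that `Module.to` handles both the same way: it converts parameters and registered buffers, and leaves plain tensor attributes alone.

```python
import torch
import torch.nn as nn

class Demo(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)                 # parameters: converted by .to()
        self.register_buffer("mask", torch.ones(4))   # buffer: converted by .to()
        self.plain = torch.ones(4)                    # plain attribute: left untouched

    def forward(self, x):
        # Tensors created here land on the default device/dtype unless told
        # otherwise, so tie them to the input explicitly.
        bias = torch.zeros(x.size(-1), device=x.device, dtype=x.dtype)
        return self.linear(x) * self.mask + bias

# float64 stands in for a device move so the effect is visible without a GPU.
model = Demo().to(torch.float64)
print(model.linear.weight.dtype, model.mask.dtype, model.plain.dtype)

# torch.long is an alias for the 64-bit integer dtype, not a float type.
ids = torch.tensor([3, 1, 2], dtype=torch.long)
print(ids.dtype == torch.int64, ids.is_floating_point())

out = model(torch.ones(2, 4, dtype=torch.float64))
```

The usual remedies are to register such tensors as buffers, or to create them inside forward with `device=x.device` as shown.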

Key highlights

  • Pre-trained checkpoint included (chatbot_0509_1437); skip preprocessing if desired
  • Greedy search decoder implemented; beam search is stubbed as “to do”
  • Evaluation is qualitative only; the README notes “quantitative evaluation not yet written, should use perplexity”
  • Code follows a class-based structure refactored from the official PyTorch chatbot tutorial
  • Chinese word segmentation via jieba; author notes quality issues from missing stopword filtering
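For readers unfamiliar with the greedy search mentioned in the highlights, the following is a toy sketch of the idea, not the repository's decoder. The `greedy_decode` helper and the `TABLE` of next-token scores are hypothetical, and the scores depend only on the previous token, whereas a real GRU decoder also conditions on its hidden state and the attention context.

```python
def greedy_decode(step_fn, sos, eos, max_len=10):
    """Pick the single highest-scoring token at each step until EOS or max_len."""
    tokens, prev = [], sos
    for _ in range(max_len):
        scores = step_fn(prev)                # dict mapping token -> score
        prev = max(scores, key=scores.get)    # greedy choice: argmax only
        if prev == eos:
            break
        tokens.append(prev)
    return tokens

# Hypothetical toy "decoder": a fixed table of next-token scores.
TABLE = {
    "<sos>": {"你": 0.6, "好": 0.3, "<eos>": 0.1},
    "你": {"好": 0.7, "你": 0.1, "<eos>": 0.2},
    "好": {"<eos>": 0.8, "你": 0.1, "好": 0.1},
}

result = greedy_decode(TABLE.__getitem__, "<sos>", "<eos>")
print(result)  # ['你', '好']
```

Beam search differs by keeping the top k partial sequences at each step instead of only the best one, which is why it is usually a separate implementation effort rather than a small tweak.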

Caveats

  • Author explicitly states the model “doesn’t work very well” and cites segmentation quality as a key bottleneck
  • Beam search unfinished; no quantitative metrics implemented
  • Knowledge base is tiny (100 entries) and domain-specific to Tencent Cloud services
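The perplexity metric the author says is still missing is straightforward to compute once per-token probabilities are available. This is a minimal sketch of the standard definition, assuming the model's probabilities for the reference tokens have already been collected into a list; the function name `perplexity` and the sample values are illustrative.

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-likelihood per reference token."""
    nll = -sum(math.log(p) for p in token_probs)
    return math.exp(nll / len(token_probs))

# A model that spreads probability evenly over 4 choices has perplexity 4.
uniform = perplexity([0.25] * 8)
# A model that is always certain and correct has perplexity 1.
perfect = perplexity([1.0, 1.0, 1.0])
print(uniform, perfect)
```

Lower is better: the value can be read as the effective number of tokens the model is choosing between at each step.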

Verdict

Worth browsing for the honest debugging notes if you’re a student bridging PyTorch tutorials to your first real NLP project. Skip if you need a production Chinese dialogue system or reproducible benchmarks.
