hemingkx/ChineseNMT

A Transformer that speaks Chinese, warts and all

A straightforward PyTorch implementation of English-to-Chinese NMT that shows its work, BLEU scores included.

500 stars · Python · Language Models

What it does

ChineseNMT trains a standard Transformer to translate English news text into Chinese. It tokenizes both languages into subwords with SentencePiece, trains on data from WMT 2018, and builds on the well-known Harvard transformer-pytorch reference implementation. The repo includes training scripts, a pretrained model, and a single-sentence inference mode.

The interesting bit

The author doesn’t just dump code; they expose the plumbing. A tidy ablation table toggles NoamOpt and label smoothing (NoamOpt helps, label smoothing doesn’t), and the beam search sweep runs up to size 5 for a 0.27 BLEU gain. It’s the kind of obsessive thoroughness that makes a tutorial repo actually useful for reproduction.
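
For readers who haven't met NoamOpt before, here is a minimal sketch of the learning-rate schedule it usually refers to: linear warmup followed by inverse-square-root decay, as in the original Transformer paper. The function name `noam_rate` and the defaults (`d_model=512`, `warmup=4000`, `factor=1.0`) are assumptions for illustration, not values taken from the repo's config.

```python
def noam_rate(step: int, d_model: int = 512, warmup: int = 4000,
              factor: float = 1.0) -> float:
    """Learning rate at optimizer step `step` (1-based).

    Defaults are illustrative assumptions, not the repo's settings.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    # Rises linearly until `warmup`, then decays as 1 / sqrt(step).
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)


if __name__ == "__main__":
    for step in (1, 1000, 4000, 16000):
        print(step, f"{noam_rate(step):.2e}")
```

The rate peaks at exactly `step == warmup`, which is why the warmup length matters as much as the scale factor when reproducing an ablation like this one.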

Key highlights

  • Pretrained model available via Baidu Pan (password: g9wl); no retraining required to experiment
  • Multi-GPU support via CUDA_VISIBLE_DEVICES and a device_id list in config
  • Single-sentence translation mode with a concrete example comparing ground truth against beam-3 output
  • Windows compatibility notes contributed by the community in issue #2
  • Trained and tested on dual GTX 1080 Ti cards; ~1 hour per epoch
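
To make the beam-3 comparison above concrete, here is a toy sketch of beam search over a hand-written probability table. Everything in it (`beam_search`, `toy_model`, the token names) is hypothetical and stands in for the real model's decoder; it also skips length normalization, which real systems often add.

```python
import math


def beam_search(next_log_probs, beam_size=3, max_len=10, eos="</s>"):
    """Return the highest-scoring (tokens, log_prob) hypothesis.

    `next_log_probs(prefix)` is a hypothetical scoring function that maps a
    tuple of tokens to a dict of {next_token: log_probability}.
    """
    beams = [((), 0.0)]  # unfinished hypotheses: (tokens, cumulative log prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, lp in next_log_probs(tokens).items():
                candidates.append((tokens + (tok,), score + lp))
        # Keep only the `beam_size` best expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    return max(finished, key=lambda c: c[1])


def toy_model(prefix):
    """Toy distribution where the greedy first choice is a trap."""
    table = {
        (): {"a": math.log(0.6), "b": math.log(0.4)},
        ("a",): {"x": math.log(0.5), "y": math.log(0.5)},
        ("b",): {"z": math.log(0.9), "w": math.log(0.1)},
    }
    return table.get(prefix, {"</s>": 0.0})


if __name__ == "__main__":
    print(beam_search(toy_model, beam_size=1))  # greedy decoding
    print(beam_search(toy_model, beam_size=2))
```

With `beam_size=1` the search commits to `a` and ends with probability 0.30; with `beam_size=2` it keeps `b` alive and finds the 0.36 path. Gains of that kind shrink quickly as the beam widens, which fits the small BLEU differences the repo reports.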

Caveats

  • Linux-only by default; Windows requires manual UTF-8 encoding fixes
  • Pretrained model hosted on Baidu Pan, which may be inaccessible outside China
  • The “best” model reaches only 25.94 BLEU on the test set: respectable, but not competitive with modern systems
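
On the Windows caveat: one common form such a fix takes is passing `encoding="utf-8"` explicitly wherever text files are opened, since Python on Windows may otherwise fall back to a locale encoding such as GBK. The sketch below illustrates the idea with hypothetical helpers (`read_lines`, `write_lines`) and a made-up file name; it is not confirmed to be the exact change described in issue #2.

```python
import os
import tempfile


def write_lines(path, lines):
    # Explicit encoding: do not rely on the platform default.
    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")


def read_lines(path):
    # Same encoding on the read side, so Chinese text round-trips intact.
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


# Round-trip two Chinese sentences through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.zh.txt")  # hypothetical file name
    write_lines(sample_path, ["你好，世界", "机器翻译"])
    round_trip = read_lines(sample_path)
```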

Verdict

Grab this if you’re a Chinese-speaking student who wants to see a Transformer built from recognizable pieces, or if you need a baseline to beat. Skip it if you need production-grade translation or SOTA results; this is a learning scaffold, not a product.
