A Transformer that speaks Chinese, warts and all
A straightforward PyTorch implementation of English-to-Chinese NMT that shows its work—and its BLEU scores.

What it does
ChineseNMT trains a standard Transformer to translate English news text into Chinese. It uses SentencePiece for bilingual tokenization, pulls data from WMT 2018, and wraps the well-known Harvard transformer-pytorch reference implementation. The repo includes training scripts, a pretrained model, and a single-sentence inference mode.
The interesting bit
The author doesn’t just dump code—they expose the plumbing. You can watch NoamOpt and label smoothing duke it out in a tidy ablation table (NoamOpt wins, label smoothing loses), and the beam search sweep goes all the way to size 5 for a 0.27 BLEU gain. It’s the kind of obsessive thoroughness that makes a tutorial repo actually useful for reproduction.
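For readers who have not met the two techniques in that ablation, here is a minimal sketch of each in plain Python. The learning-rate formula is the warmup schedule commonly called "Noam" in the Harvard reference code; the defaults (`d_model=512`, `warmup=4000`, `factor=1.0`) and the function names are illustrative assumptions, not values read from this repo's config. The label smoothing shown is the generic textbook variant, which may differ in detail from what the repo does.

```python
from typing import List


def noam_rate(step: int, d_model: int = 512, warmup: int = 4000,
              factor: float = 1.0) -> float:
    """Noam schedule: linear warmup, then inverse-square-root decay.

    Defaults are assumptions for illustration, not the repo's settings.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)


def smooth_targets(target: int, vocab_size: int,
                   smoothing: float = 0.1) -> List[float]:
    """Generic label smoothing: move `smoothing` mass off the true token
    and spread it evenly over the other vocab_size - 1 tokens."""
    off_value = smoothing / (vocab_size - 1)
    dist = [off_value] * vocab_size
    dist[target] = 1.0 - smoothing
    return dist


if __name__ == "__main__":
    # The rate climbs until step == warmup, then decays.
    for step in (1, 1000, 4000, 16000):
        print(step, noam_rate(step))
    print(smooth_targets(target=2, vocab_size=5))
```

The schedule peaks exactly at `step == warmup`, which is why the warmup length matters as much as the base rate when reproducing a result like this.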
Key highlights
- Pretrained model available via Baidu Pan (password: g9wl) — no retraining required to experiment
- Multi-GPU support via CUDA_VISIBLE_DEVICES and a device_id list in config
- Single-sentence translation mode with a concrete example comparing ground truth against beam-3 output
- Windows compatibility notes contributed by the community in issue #2
- Trained and tested on dual GTX 1080 Ti cards; ~1 hour per epoch
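The multi-GPU highlight is easier to picture with a small sketch. CUDA renumbers whatever devices `CUDA_VISIBLE_DEVICES` exposes starting from 0, so a `device_id` list should hold those logical indices, not the physical ones. The helper name `logical_device_ids` and the value `"0,1"` are hypothetical, chosen to match a two-card setup; the repo's actual config layout may differ.

```python
import os
from typing import List

# Assumption: expose physical GPUs 0 and 1. This must be set before
# any CUDA library initialises, i.e. before the framework touches a GPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"


def logical_device_ids(visible: str) -> List[int]:
    """Map a CUDA_VISIBLE_DEVICES string to the indices the process sees.

    Hypothetical helper: with "2,3" exposed, the process addresses
    those two cards as devices 0 and 1.
    """
    physical = [p.strip() for p in visible.split(",") if p.strip()]
    return list(range(len(physical)))


# The list a config file might carry under the name device_id.
device_id = logical_device_ids(os.environ["CUDA_VISIBLE_DEVICES"])
print(device_id)
```

Keeping the two settings in sync matters: a `device_id` entry that points past the number of visible cards will typically fail at startup.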
Caveats
- Linux-only by default; Windows requires manual UTF-8 encoding fixes
- Pretrained model hosted on Baidu Pan, which may be inaccessible outside China
- The “best” model only hits 25.94 BLEU on test — respectable but not competitive with modern systems
Verdict
Grab this if you’re a Chinese-speaking student who wants to see a Transformer built from recognizable pieces, or if you need a baseline to beat. Skip it if you need production-grade translation or SOTA results; this is a learning scaffold, not a product.