TensorFlow 1.4 NER: A time capsule from 2017's Chinese NLP frontier
A straightforward BiRNN+CRF implementation for Chinese named-entity recognition that shows its age in the details.

What it does
Trains a bidirectional RNN with a CRF layer to tag named entities in Chinese text. You bring your own word2vec-style embeddings (colon-separated, 300-dim by default), supply source/target files, and switch between train and predict modes by editing a flag in config.py.
The interesting bit
The README makes a small fuss about using TensorFlow’s then-new Dataset API for “elegant” data feeding — a telling snapshot of what counted as modern infrastructure in the TF 1.x era. The architecture itself (BiRNN → CRF) was the standard playbook; the value was in having a working, documented Chinese implementation.
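To make the BiRNN → CRF pairing concrete, here is a minimal sketch of the decoding step a CRF layer performs at prediction time: Viterbi search over per-token tag scores plus tag-to-tag transition scores. It is plain Python written for illustration, not code from the repository. The tag set (O, B-PER, I-PER), the function name viterbi_decode, and all the numbers are assumptions; the review does not say which entity types or tagging scheme the project uses.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag index sequence for one sentence.

    emissions[t][j]   : score of tag j at position t (what a BiRNN would emit)
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters)
    """
    if not emissions:
        return []
    num_tags = len(transitions)
    # Best score of any path that ends in each tag at the current position.
    scores = list(emissions[0])
    # backpointers[t - 1][j] is the best previous tag when position t has tag j.
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_scores = []
        pointers = []
        for j in range(num_tags):
            best_prev = max(
                range(num_tags), key=lambda i: scores[i] + transitions[i][j]
            )
            pointers.append(best_prev)
            new_scores.append(
                scores[best_prev] + transitions[best_prev][j] + emissions[t][j]
            )
        scores = new_scores
        backpointers.append(pointers)
    # Start from the best final tag and walk the backpointers to the start.
    best_last = max(range(num_tags), key=lambda j: scores[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and scores, chosen only to illustrate the mechanics.
TAGS = ["O", "B-PER", "I-PER"]
TRANSITIONS = [
    [0.0, 0.0, -10.0],  # from O: jumping straight to I-PER is heavily penalised
    [0.0, -1.0, 1.0],   # from B-PER: continuing with I-PER is encouraged
    [0.0, 0.0, 0.5],    # from I-PER
]
EMISSIONS = [
    [0.1, 2.0, 0.3],  # token 0 looks like the start of a name
    [0.5, 0.2, 0.6],  # token 1 is ambiguous on its own
    [1.5, 0.1, 0.2],  # token 2 looks like a non-entity
]

decoded = [TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
print(decoded)  # ['B-PER', 'I-PER', 'O']
```

The point of the CRF layer is visible in the transition matrix: picking the best tag for each token independently can yield sequences such as O followed by I-PER, while the joint search rules them out. In TF 1.x this kind of decoding was available through tf.contrib.crf, though the review does not confirm which calls the project relies on.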
Key highlights
- Requires TensorFlow ≥ 1.4.1; the README is explicitly unsure whether 1.2 still works after upgrades
- Needs manual word segmentation and custom-trained embeddings (gensim/glove acceptable)
- Configuration lives entirely in config.py via tf.app.flags; no CLI args
- Resource folder contains format examples; author offers to share pre-trained vectors on request
- ~1,000 stars suggest it filled a gap for Chinese-speaking practitioners at the time
Caveats
- TensorFlow 1.4 is long dead; running this today requires archaeology skills or Docker time travel
- No mention of model performance metrics, supported entity types, or benchmark datasets
- “Contact me for pre-trained vectors” is a single point of failure for reproducibility
Verdict
Worth studying if you’re tracing the evolution of Chinese NLP tooling or need to resurrect a legacy pipeline. Everyone else should look to modern frameworks (spaCy, Stanza, or transformers-based approaches) unless they specifically need this architecture for comparison.