A Swiss-Army knife for Chinese text that fits in one import
xmnlp bundles a dozen Chinese NLP tasks—segmentation, NER, sentiment, pinyin, even radicals—behind a single pip install, with ONNX models you download separately.

What it does
xmnlp is an all-in-one Chinese NLP toolkit. It handles word segmentation, part-of-speech tagging, named-entity recognition, sentiment analysis, text correction, keyword/keyphrase extraction, pinyin conversion, and even Chinese character radical lookup. Most heavy lifting runs through RoBERTa + CRF models exported to ONNX, with faster rule-based fallbacks (reverse maximum matching) when you don’t need neural precision.
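To make the rule-based fallback concrete, here is a minimal sketch of reverse maximum matching in plain Python. It is not xmnlp's implementation: the function name `reverse_max_match`, the `max_len` parameter, and the four-word vocabulary are all made up for illustration.

```python
def reverse_max_match(text, vocab, max_len=4):
    """Segment `text` right to left, taking the longest dictionary word each time."""
    tokens = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            # A single character is always accepted, so the scan always advances.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                end -= size
                break
    tokens.reverse()  # pieces were collected back to front
    return tokens


# Toy vocabulary (hypothetical); a real dictionary would hold many thousands of words.
vocab = {"研究", "研究生", "生命", "起源"}
print(reverse_max_match("研究生命的起源", vocab))
# → ['研究', '生命', '的', '起源']
```

Scanning from the right is what keeps "生命" intact here; a left-to-right greedy scan over the same vocabulary would grab "研究生" first and strand "命".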
The interesting bit
The “speed vs. accuracy” dial is explicit: segmentation and POS tagging each expose fast_* and deep_* variants (for example fast_seg and deep_seg), so you can trade neural nuance for throughput without swapping libraries. Radical lookup and pinyin conversion are just a HashMap and a Trie lookup, respectively: simple, but oddly hard to find bundled with modern transformer-based tools.
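The two lookup structures mentioned above can be sketched in a few lines. This is an illustration of the general technique, not xmnlp's code: `PinyinTrie`, `radical_of`, and the tiny `TOY_PINYIN` and `TOY_RADICALS` tables are hypothetical. The trie does longest-prefix matching so that a phrase entry can override the default reading of a polyphonic character.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.value = None  # list of syllables if a dictionary entry ends here


class PinyinTrie:
    def __init__(self, entries):
        self.root = TrieNode()
        for phrase, syllables in entries.items():
            node = self.root
            for ch in phrase:
                node = node.children.setdefault(ch, TrieNode())
            node.value = syllables

    def longest_match(self, text, start):
        """Return (end_index, syllables) for the longest entry starting at `start`."""
        node, best = self.root, None
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.value is not None:
                best = (i + 1, node.value)
        return best

    def convert(self, text):
        out, i = [], 0
        while i < len(text):
            match = self.longest_match(text, i)
            if match is None:
                out.append(text[i])  # unknown characters pass through unchanged
                i += 1
            else:
                i, syllables = match
                out.extend(syllables)
        return out


# Toy data (hypothetical): "行" reads "hang" inside "银行" but "xing" on its own.
TOY_PINYIN = {"银": ["yin"], "行": ["xing"], "银行": ["yin", "hang"], "人": ["ren"]}
TOY_RADICALS = {"河": "氵", "好": "女"}


def radical_of(ch):
    # Plain hash-map lookup; returns None for characters missing from the table.
    return TOY_RADICALS.get(ch)


trie = PinyinTrie(TOY_PINYIN)
print(trie.convert("银行行人"))  # → ['yin', 'hang', 'xing', 'ren']
print(radical_of("河"))          # → 氵
```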
Key highlights
- Segmentation, POS tagging, and NER via fine-tuned RoBERTa + CRF models, with custom dictionary support (jieba-compatible format)
- Sentiment analysis and spell-checking (detector + corrector) included
- Keyword/keyphrase extraction via TextRank
- Sentence embeddings and similarity calculation
- ONNX Runtime inference; supports Python 3.6–3.8 on Linux, Windows, macOS
- Models downloaded separately via Feishu or Baidu Netdisk—version-locked to the package
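For readers unfamiliar with TextRank, the keyword-extraction item above boils down to running PageRank over a word co-occurrence graph. The sketch below is a generic, simplified version and assumes the text is already segmented with stop words removed; `textrank_keywords` and its parameters are illustrative names, not xmnlp's API.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=1, damping=0.85, iterations=50, top_k=3):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    # Link each word to the `window` words that follow it.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        updated = {}
        for word in neighbors:
            # Each neighbor shares its score equally among its own links.
            incoming = sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            updated[word] = (1 - damping) + damping * incoming
        scores = updated

    # Highest score first; ties are broken by the word itself for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


# Hypothetical pre-segmented input.
tokens = ["自然语言", "处理", "模型", "处理", "文本", "模型", "处理", "拼音"]
print(textrank_keywords(tokens))
```

In this toy input "处理" touches every other word, so it ranks first; words that appear only once at the edges score lowest.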
Caveats
- Deep model interfaces are Simplified-Chinese only; no Traditional Chinese support
- Model weights are hosted on Chinese cloud services (Feishu/Baidu), not HuggingFace or GitHub releases
- Official support stops at Python 3.6–3.8, which suggests the project may not be actively tracking newer Python releases
Verdict
Good fit if you need one library to cover the full Chinese NLP pipeline without orchestrating multiple dependencies. Skip it if you require Traditional Chinese, want models pip-installable from PyPI, or need the bleeding-edge accuracy of dedicated single-task libraries.