Is mynlp open source?

Yes — mayabot/mynlp is open source, released under the Apache-2.0 license.

What language is mynlp written in?

mayabot/mynlp is primarily written in Java.

How popular is mynlp?

mayabot/mynlp has 688 stars on GitHub.

Where can I find mynlp?

mayabot/mynlp is on GitHub at https://github.com/mayabot/mynlp.

mayabot/mynlp

Java's answer to "just give me Chinese NLP that works"

A modular, Maven-friendly toolkit that ships perceptron-based segmentation, NER, pinyin, and BM25 without dragging in Python's ecosystem.

★ 688 stars · Java · ML Frameworks · Data Tooling

What it does

Mynlp is a Java-native Chinese NLP toolkit built for production use. It covers the standard bases—word segmentation, part-of-speech tagging, named entity recognition, pinyin conversion, traditional/simplified Chinese conversion, and BM25 scoring—packaged as discrete Maven modules so you pull only what you need.
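
For readers who have not met BM25 before, here is a minimal sketch of the standard Okapi BM25 formula in plain Java. It is not mynlp's implementation or API: the class name `Bm25Sketch`, the constants `K1 = 1.2` and `B = 0.75`, and the pre-tokenized input lists are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative Okapi BM25 scorer over pre-tokenized documents (hypothetical, not mynlp code). */
class Bm25Sketch {
    private static final double K1 = 1.2;  // assumed term-frequency saturation constant
    private static final double B = 0.75;  // assumed document-length normalization strength

    private final List<Map<String, Integer>> termFreqs = new ArrayList<>();
    private final List<Integer> docLengths = new ArrayList<>();
    private final Map<String, Integer> docFreq = new HashMap<>();
    private final double avgDocLength;

    Bm25Sketch(List<List<String>> docs) {
        long totalLength = 0;
        for (List<String> doc : docs) {
            Map<String, Integer> tf = new HashMap<>();
            for (String term : doc) {
                tf.merge(term, 1, Integer::sum);       // count occurrences within this document
            }
            for (String term : tf.keySet()) {
                docFreq.merge(term, 1, Integer::sum);  // count documents containing the term
            }
            termFreqs.add(tf);
            docLengths.add(doc.size());
            totalLength += doc.size();
        }
        avgDocLength = docs.isEmpty() ? 0.0 : (double) totalLength / docs.size();
    }

    /** BM25 score of the document at docIndex for the given query terms. */
    double score(List<String> query, int docIndex) {
        Map<String, Integer> tf = termFreqs.get(docIndex);
        int n = termFreqs.size();
        double lengthNorm = 1 - B + B * docLengths.get(docIndex) / avgDocLength;
        double total = 0.0;
        for (String term : query) {
            int f = tf.getOrDefault(term, 0);
            if (f == 0) {
                continue;  // a term absent from the document contributes nothing
            }
            int df = docFreq.get(term);
            double idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
            total += idf * f * (K1 + 1) / (f + K1 * lengthNorm);
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
                List.of("中文", "分词", "工具"),
                List.of("拼音", "转换", "工具"),
                List.of("中文", "拼音"));
        Bm25Sketch bm25 = new Bm25Sketch(docs);
        List<String> query = List.of("中文", "分词");
        for (int i = 0; i < docs.size(); i++) {
            System.out.println("doc " + i + " score = " + bm25.score(query, i));
        }
    }
}
```

In this toy corpus the first document contains both query terms, so it scores highest, while the second contains neither and scores zero. In a real pipeline the token lists would come from a segmenter rather than being written by hand.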

The interesting bit

The resource splitting is unusually sane. Core dictionaries and models (some 60MB+) live in separate artifacts rather than being bundled into the main JAR. You can take the “lazy” route with the mynlp-all convenience package or cherry-pick resources à la carte, which is useful if you’re counting megabytes or want to keep unused models out of your containers.

Key highlights

Perceptron-based segmentation and tagging (not purely dictionary-driven)
fastText and StarSpace integration for word/label representations
Custom dictionary support with correction capabilities
New word discovery and person-name recognition as built-in modules
Acknowledged lineage from HanLP and ansj_seg: it credits the proven algorithms it borrows instead of quietly reinventing them
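
To make "perceptron-based, not purely dictionary-driven" concrete, here is a deliberately simplified sketch: a binary perceptron that learns whether a word boundary falls between two adjacent characters. It is not mynlp's model, which the source does not describe in detail. The class name `BoundaryPerceptron`, the feature templates, and the toy training corpus are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy perceptron that predicts word boundaries between characters (hypothetical, not mynlp code). */
class BoundaryPerceptron {
    private final Map<String, Integer> weights = new HashMap<>();

    /** Indicator features for the gap just before position i (1 <= i < text.length()). */
    private static List<String> features(String text, int i) {
        char left = text.charAt(i - 1);
        char right = text.charAt(i);
        List<String> feats = new ArrayList<>();
        feats.add("bias");
        feats.add("L:" + left);           // character to the left of the gap
        feats.add("R:" + right);          // character to the right of the gap
        feats.add("LR:" + left + right);  // the character pair spanning the gap
        return feats;
    }

    private int score(List<String> feats) {
        int sum = 0;
        for (String f : feats) {
            sum += weights.getOrDefault(f, 0);
        }
        return sum;
    }

    /** Trains on gold segmentations until an epoch has no mistakes; returns the epochs run. */
    int train(List<List<String>> corpus, int maxEpochs) {
        for (int epoch = 1; epoch <= maxEpochs; epoch++) {
            int mistakes = 0;
            for (List<String> words : corpus) {
                String text = String.join("", words);
                boolean[] gold = new boolean[text.length()];  // gold[i]: boundary before position i
                int pos = 0;
                for (String w : words) {
                    pos += w.length();
                    if (pos < text.length()) {
                        gold[pos] = true;
                    }
                }
                for (int i = 1; i < text.length(); i++) {
                    List<String> feats = features(text, i);
                    boolean predicted = score(feats) > 0;
                    if (predicted != gold[i]) {
                        int delta = gold[i] ? 1 : -1;  // perceptron update toward the gold label
                        for (String f : feats) {
                            weights.merge(f, delta, Integer::sum);
                        }
                        mistakes++;
                    }
                }
            }
            if (mistakes == 0) {
                return epoch;
            }
        }
        return maxEpochs;
    }

    /** Splits text wherever the learned weights score a gap above zero. */
    List<String> segment(String text) {
        List<String> words = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < text.length(); i++) {
            if (score(features(text, i)) > 0) {
                words.add(text.substring(start, i));
                start = i;
            }
        }
        if (start < text.length()) {
            words.add(text.substring(start));
        }
        return words;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = List.of(
                List.of("我", "喜欢", "中文"),
                List.of("中文", "分词", "工具"),
                List.of("我", "喜欢", "分词"));
        BoundaryPerceptron model = new BoundaryPerceptron();
        model.train(corpus, 100);
        System.out.println(model.segment("我喜欢中文"));
    }
}
```

The point of the sketch is the contrast with dictionary lookup: boundaries come from learned feature weights, so the model can still make a decision on a character pair it never saw together, using the single-character and bias features. Production segmenters typically use richer feature templates and sequence decoding, and this toy version makes no claim about accuracy on unseen text.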

Caveats

Documentation and community presence (QQ group, Chinese-language docs) assume Chinese fluency; English support appears minimal
688 stars suggest modest adoption outside its target ecosystem; how well it has been battle-tested at scale is unclear from the README alone

Verdict

Worth a look if you’re running JVM-based services and need Chinese text processing without bridging to Python. Skip it if your pipeline is already invested in HanLP’s newer iterations or if you need extensive multilingual support.

Frequently asked

What is mayabot/mynlp?: A modular, Maven-friendly toolkit that ships perceptron-based segmentation, NER, pinyin, and BM25 without dragging in Python's ecosystem.