Turning news articles into knowledge graphs, the hard way
A Chinese NLP project that extracts entities and relationships from text and renders them as interactive HTML graphs.

What it does
TextGrapher takes a Chinese news article, runs it through keyword extraction, named entity recognition, and subject-verb-object parsing, then dumps the results into a force-directed graph saved as graph.html. The API is two lines: instantiate CrimeMining(), call .main(content). The examples lean heavily on criminal cases and corporate scandals—ZTE, the Wei Zexi medical fraud case, a campus murder.
The interesting bit
The author is upfront that this is an attempt at a genuinely hard problem: how to represent document semantics in a structured, glanceable form. The pipeline fuses three extraction layers—frequency, entities, and syntactic triples—rather than betting on any single technique. That frankness is refreshing in a field prone to hand-waving.
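To make the "three layers, one graph" idea concrete, here is a minimal sketch of how such a merge could work. It is not the project's actual code: the function name build_graph, the input shapes, and the toy data are all assumptions made for illustration.

```python
def build_graph(keywords, entities, triples):
    """Merge three extraction layers into one node/edge structure.

    All input shapes are hypothetical, chosen for this sketch:
      keywords: list of (word, weight) pairs
      entities: list of (text, entity_type) pairs
      triples:  list of (subject, verb, object) tuples
    """
    nodes = {}  # label -> set of layers that produced the label
    edges = []  # (source, target, relation)

    for word, _weight in keywords:
        nodes.setdefault(word, set()).add("keyword")
    for text, _entity_type in entities:
        nodes.setdefault(text, set()).add("entity")
    for subj, verb, obj in triples:
        # Both ends of a triple become nodes; the verb labels the edge.
        nodes.setdefault(subj, set()).add("svo")
        nodes.setdefault(obj, set()).add("svo")
        edges.append((subj, obj, verb))
    return nodes, edges


# Toy inputs with placeholder names, not output from any real extractor.
nodes, edges = build_graph(
    keywords=[("Company A", 0.9), ("fine", 0.4)],
    entities=[("Company A", "ORG"), ("Regulator B", "ORG")],
    triples=[("Regulator B", "fines", "Company A")],
)
```

A label found by more than one layer ends up as a single node tagged with every layer that produced it, which is one plausible way to keep the graph from filling up with duplicates.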
Key highlights
- Ships with working examples on real Chinese news events (see screenshots)
- Combines keyword, NER, and SVO extraction into one graph view
- Output is a self-contained HTML file—no frontend build step
- ~1,500 stars suggest the idea resonates, even if the implementation is rough
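The "self-contained HTML" point is easy to picture with a small sketch: embed the graph data as JSON in a page template and let a few lines of inline script draw it. Everything here is an assumption for illustration (the render_html name, the template, and the data shape), and the layout is a plain circle rather than the force-directed layout the project produces.

```python
import json

# Page template; __DATA__ is replaced with the JSON payload.
PAGE = """<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>graph</title></head>
<body>
<svg id="g" width="600" height="600"></svg>
<script>
const data = __DATA__;
const svg = document.getElementById("g");
const NS = "http://www.w3.org/2000/svg";
const pos = {};
// Place nodes evenly on a circle (a stand-in for a force-directed layout).
data.nodes.forEach((n, i) => {
  const a = 2 * Math.PI * i / data.nodes.length;
  pos[n] = [300 + 220 * Math.cos(a), 300 + 220 * Math.sin(a)];
});
// Draw one line per edge; only the first two fields (source, target) are used.
data.edges.forEach(([s, t]) => {
  const l = document.createElementNS(NS, "line");
  l.setAttribute("x1", pos[s][0]); l.setAttribute("y1", pos[s][1]);
  l.setAttribute("x2", pos[t][0]); l.setAttribute("y2", pos[t][1]);
  l.setAttribute("stroke", "#999");
  svg.appendChild(l);
});
// Label each node with its text.
data.nodes.forEach(n => {
  const t = document.createElementNS(NS, "text");
  t.setAttribute("x", pos[n][0]); t.setAttribute("y", pos[n][1]);
  t.textContent = n;
  svg.appendChild(t);
});
</script>
</body></html>
"""


def render_html(nodes, edges):
    """Return a single HTML page with the graph data embedded as JSON.

    Assumes every edge endpoint also appears in `nodes`.
    """
    payload = json.dumps(
        {"nodes": list(nodes), "edges": [list(e) for e in edges]},
        ensure_ascii=False,  # keep Chinese labels readable in the file
    )
    # Prevent a label containing "</script>" from closing the script tag.
    payload = payload.replace("</", "<\\/")
    return PAGE.replace("__DATA__", payload)


page = render_html(["A", "B"], [("A", "B", "sues")])
```

Writing the returned string to a file such as graph.html with UTF-8 encoding gives a page that opens directly in a browser, with no build step and no external scripts.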
Caveats
- The README explicitly warns that “NLP performance limits” create “multiple deficiencies” in extraction quality
- The class name CrimeMining() suggests the code may be narrowly tuned to crime/corporate scandal text; unclear how it generalizes
- No mention of model versions, dependencies, or installation instructions
Verdict
Worth a look if you’re prototyping Chinese text-to-graph pipelines and need a baseline to beat. Skip it if you need production-grade accuracy or English-language support—the author doesn’t claim either.