A Chinese-language survival guide for NLP and knowledge graphs
Curated links, paper notes, and tool lists for developers navigating Chinese NLP, KG construction, and dialogue systems.

What it does
This repo is a well-organized bookmark collection and reading list for Chinese-language NLP practitioners. It gathers links to papers, Baidu Naotu mind-map summaries, open-source QA systems, preprocessing tools, and conference rankings, covering everything from BERT and ERNIE to event logic graphs and financial document extraction.
The interesting bit
The value is in the curation, not the code. The maintainer has been assembling this since 2019, tracking the field’s shift from “data fusion knowledge” to “All in LLM.” For Chinese developers, it’s a rare centralized index of domestic tools (THULAC, HanLP, jiagu) alongside international staples, plus hard-to-find conference slides from Chinese industry players like Xiaomi and iFlytek.
Key highlights
- Paper summaries via Baidu Naotu mind maps for BERT, ERNIE, T5, and attention surveys
- Curated lists of Chinese text preprocessing tools (THULAC, LTP, HanLP, jiagu, Jieba)
- Industry case studies from CCKS conferences (Xiaomi voice interaction, iFlytek KG applications)
- Event logic graph (事理图谱) resources, a topic less commonly covered in Western literature
- Conference ranking table (ACL, EMNLP, CIKM, SIGKDD, etc.) with Chinese academic classifications
Caveats
- Most content is external links and PDFs; original code or implementations are minimal
- Several sections reference a “SmartInteraction” dialogue platform that appears to be commented out or abandoned
- Link rot is a real risk given the reliance on third-party mind-map services and WeChat articles
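The link-rot caveat is something a reader could check mechanically before relying on the list. The following is a minimal sketch, not tooling shipped with the repo. It assumes the list is stored as Markdown with inline `[text](url)` links; the function names (`extract_links`, `check_link`, `find_dead_links`) are hypothetical and exist only for illustration.

```python
import re
import urllib.error
import urllib.request

# Matches inline Markdown links of the form [text](url) and captures the URL.
# Assumption: the list uses this link style; reference-style links are ignored.
MD_LINK = re.compile(r"\[[^\]]*\]\((https?://[^\s)]+)\)")


def extract_links(markdown_text):
    """Return unique http(s) URLs from inline Markdown links, in first-seen order."""
    seen = []
    for url in MD_LINK.findall(markdown_text):
        if url not in seen:
            seen.append(url)
    return seen


def check_link(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails outright."""
    req = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        return None


def find_dead_links(markdown_text, checker=check_link):
    """Return (url, status) pairs for links that fail or answer with status >= 400."""
    dead = []
    for url in extract_links(markdown_text):
        status = checker(url)
        if status is None or status >= 400:
            dead.append((url, status))
    return dead
```

Calling `find_dead_links(open("README.md", encoding="utf-8").read())` would list the suspect links, assuming the file is named `README.md`. Some hosts reject `HEAD` requests or block automated clients, so a reported failure is a prompt to check the link by hand, not proof that it is dead.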
Verdict
Worth a star if you’re a Chinese-speaking developer or researcher trying to map the NLP/KG landscape without drowning in scattered bookmarks. Skip it if you need runnable code or English-first documentation.