
jacksonllee/pycantonese

A Python library for Cantonese linguistics and natural language processing tasks including word segmentation and POS tagging.

407 stars · Python · Data Tooling · ML Frameworks
Velocity (7d): +0.1 ★/day · Trend: steady

PyCantonese provides tools for accessing and searching Cantonese corpus data, parsing Jyutping romanization, segmenting words, and tagging parts of speech. It prioritizes ease of use and linguistic grounding, and serves both academic and commercial users. Since v4.0.0, it depends on Rustling for efficient text processing.
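
To make "parsing Jyutping romanization" concrete, here is a minimal standard-library sketch that splits a Jyutping string into syllables and breaks each one into onset, nucleus, coda, and tone. It is not PyCantonese's implementation or API: the names `Syllable` and `split_jyutping` are hypothetical, and the sound inventory is a simplified assumption based on the standard Jyutping scheme.

```python
import re
from typing import List, NamedTuple


class Syllable(NamedTuple):
    """One Jyutping syllable (hypothetical container, not a PyCantonese type)."""
    onset: str
    nucleus: str
    coda: str
    tone: str


# Simplified inventories (an assumption for this sketch).
# Longer alternatives come first so "ng", "gw", "kw", "aa" win over their prefixes.
ONSETS = "ng|gw|kw|b|p|m|f|d|t|n|l|g|k|h|w|z|c|s|j"
NUCLEI = "aa|oe|eo|yu|a|e|i|o|u"
CODAS = "ng|p|t|k|m|n|i|u"

SYLLABLE = re.compile(
    rf"(?P<onset>{ONSETS})?(?P<nucleus>{NUCLEI})(?P<coda>{CODAS})?(?P<tone>[1-6])"
    # Syllabic nasals such as "m4" or "ng5" have no vowel nucleus.
    r"|(?P<syllabic>ng|m)(?P<syllabic_tone>[1-6])"
)


def split_jyutping(text: str) -> List[Syllable]:
    """Split an unspaced Jyutping string into parsed syllables."""
    syllables = []
    pos = 0
    while pos < len(text):
        match = SYLLABLE.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse Jyutping at position {pos}: {text[pos:]!r}")
        if match.group("syllabic"):
            # Treat the syllabic nasal as the nucleus, with empty onset and coda.
            syllables.append(
                Syllable("", match.group("syllabic"), "", match.group("syllabic_tone"))
            )
        else:
            syllables.append(
                Syllable(
                    match.group("onset") or "",
                    match.group("nucleus"),
                    match.group("coda") or "",
                    match.group("tone"),
                )
            )
        pos = match.end()
    return syllables


if __name__ == "__main__":
    for syllable in split_jyutping("hoeng1gong2"):
        print(syllable)
```

The tone digit ends every syllable, so a left-to-right scan is enough to find syllable boundaries in this sketch. A production parser would also need to validate which onset, nucleus, and coda combinations actually occur in Cantonese.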
