Is Awesome-Korean-NLP open source?

Yes — datanada/Awesome-Korean-NLP is an open-source project tracked on heatdrop.

How popular is Awesome-Korean-NLP?

datanada/Awesome-Korean-NLP has 661 stars on GitHub.

Where can I find Awesome-Korean-NLP?

datanada/Awesome-Korean-NLP is on GitHub at https://github.com/datanada/Awesome-Korean-NLP.

datanada/Awesome-Korean-NLP

A field guide to not getting lost in Korean NLP

A curated list of tools, datasets, and papers for processing Korean text, because agglutinative morphology doesn't solve itself.

★661 stars Learning Data Tooling


What it does

This is a curated awesome-list that catalogs resources for Korean-language NLP: morphological analyzers, datasets like Sejong and NamuWiki dumps, papers, lectures, and community links. It covers both Korean-specific tools (Hannanum, Kkma, Komoran, Mecab-ko) and language-agnostic packages with Korean bindings (KoNLPy, FastText, gensim).

The interesting bit

The list explicitly splits between “NLP of Korean text” and “NLP information written in Korean” — a useful distinction if you’re hunting for tools versus hunting for tutorials you can actually read. The maintainer also keeps a live collabedit link for casual contributions, which feels charmingly retro.

Key highlights

Morpheme analyzers: 12+ options including Java stalwarts (Hannanum, Kkma), C++ workhorses (Mecab-ko), and newer entrants (Rouzeta, seunjeon)
Datasets: Government corpora (Sejong, KAIST), web dumps (Wikipedia, NamuWiki), and sentiment-labeled data (Naver movie corpus)
Bindings matter: KoNLPy wraps multiple Java analyzers for Python; kroman ports Hangul romanization across five languages
Community links: Korean-language NLP conferences since 1989, plus active Facebook groups (TensorFlow KR, AI Korea)
Odd gems: A crowdsourced Korean profanity dictionary and a TextRank summarizer demo running on Heroku
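
For a feel of what the romanization tools in the list have to handle, here is a minimal Python sketch of the Unicode arithmetic behind precomposed Hangul syllables. Each syllable in U+AC00 to U+D7A3 encodes a lead consonant, a vowel, and an optional final consonant, so it can be split with two divisions. The function names (decompose, romanize) and the letter tables are illustrative assumptions, loosely modeled on Revised Romanization; they are not kroman's API or its mapping, and the sketch ignores the sound-change rules a real romanizer would apply.

```python
# Sketch only: naive per-syllable Hangul romanization via Unicode arithmetic.
# The tables are illustrative assumptions, not taken from kroman or any listed tool.

HANGUL_BASE = 0xAC00   # first precomposed syllable, '가'
HANGUL_LAST = 0xD7A3   # last precomposed syllable, '힣'
NUM_VOWELS = 21
NUM_TAILS = 28         # 27 final consonants plus "no final consonant"

# Index order follows the Unicode jamo order within the syllable block.
LEADS = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
         "ss", "", "j", "jj", "ch", "k", "t", "p", "h"]
VOWELS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
          "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"]
TAILS = ["", "g", "kk", "gs", "n", "nj", "nh", "d", "l", "lg", "lm", "lb",
         "ls", "lt", "lp", "lh", "m", "b", "bs", "s", "ss", "ng", "j", "ch",
         "k", "t", "p", "h"]


def is_hangul_syllable(ch):
    """True if ch is a single precomposed Hangul syllable."""
    return len(ch) == 1 and HANGUL_BASE <= ord(ch) <= HANGUL_LAST


def decompose(syllable):
    """Return (lead, vowel, tail) indices for one precomposed syllable."""
    if not is_hangul_syllable(syllable):
        raise ValueError("not a precomposed Hangul syllable: %r" % (syllable,))
    index = ord(syllable) - HANGUL_BASE
    lead, rest = divmod(index, NUM_VOWELS * NUM_TAILS)
    vowel, tail = divmod(rest, NUM_TAILS)
    return lead, vowel, tail


def romanize(text):
    """Romanize syllable by syllable; non-Hangul characters pass through."""
    parts = []
    for ch in text:
        if is_hangul_syllable(ch):
            lead, vowel, tail = decompose(ch)
            parts.append(LEADS[lead] + VOWELS[vowel] + TAILS[tail])
        else:
            parts.append(ch)
    return "".join(parts)


if __name__ == "__main__":
    print(decompose("한"))   # (18, 0, 4)
    print(romanize("한글"))  # hangeul
```

Morphological analysis, the job of the analyzers listed above, is a much harder problem than this character-level split, which is why the list points to dedicated tools rather than one-liners.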

Caveats

Several paper links are dead (marked with strikethrough), and the English papers section is empty
Some tool links point to Korean-only pages or SourceForge projects that may be unmaintained
The “collabedit” contribution method suggests the list may not see frequent structured updates

Verdict

Worth bookmarking if you’re doing Korean NLP and tired of re-discovering that Mecab-ko exists. Skip it if you need actively maintained, benchmarked comparisons — this is a directory, not a review site.
