bab2min/Kiwi

A Korean tokenizer that outruns its rivals and fixes your typos

Kiwi is a fast, open-source Korean morphological analyzer with built-in typo correction and bindings for nearly every language you might actually use.

★ 756 stars · C++ · Data Tooling

What it does

Kiwi segments Korean text into morphemes (nouns, verbs, particles, endings, and the rest) using the Sejong tag set. The project reports roughly 87% accuracy on web text and 94% on written text, and since version 0.13.0 it can auto-correct simple typos during analysis. The core is C++, but the project has accumulated wrappers for Python, Java, C#, Go, R, Rust, Flutter, and WebAssembly, plus an Android AAR.
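To make the morpheme-and-tag output more concrete, here is a minimal Python sketch that groups (form, tag) pairs into coarse categories by Sejong tag prefix. It does not call Kiwi. The sample pairs are a hand-written analysis of one sentence, standing in for analyzer output. The prefix table and the function names are assumptions made for this illustration.

```python
# Coarse grouping of Sejong-style POS tags by prefix.
# The table is a simplification chosen for this example, not Kiwi's own code.
COARSE_BY_PREFIX = [
    ("NN", "noun"), ("NP", "pronoun"), ("NR", "numeral"),
    ("VV", "verb"), ("VA", "adjective"), ("VX", "auxiliary"), ("VC", "copula"),
    ("MM", "determiner"), ("MA", "adverb"), ("IC", "interjection"),
    ("J", "particle"), ("E", "ending"), ("X", "affix_or_root"),
    ("S", "symbol_or_non_hangul"),
]


def coarse_category(tag):
    """Return a coarse category name for a tag, or "other" if no prefix matches."""
    for prefix, name in COARSE_BY_PREFIX:
        if tag.startswith(prefix):
            return name
    return "other"


def group_by_category(tokens):
    """Group morpheme forms by coarse category, preserving input order."""
    groups = {}
    for form, tag in tokens:
        groups.setdefault(coarse_category(tag), []).append(form)
    return groups


# Hand-written analysis of "나는 학교에 갔다" ("I went to school").
# Illustrative input only, not captured from a Kiwi run.
sample = [
    ("나", "NP"), ("는", "JX"), ("학교", "NNG"), ("에", "JKB"),
    ("가", "VV"), ("았", "EP"), ("다", "EF"),
]

grouped = group_by_category(sample)
for category, forms in grouped.items():
    print(category, forms)
```

Matching on prefixes keeps the table short, because the Sejong set encodes the broad word class in the first one or two letters of each tag. Tags outside the table fall through to "other".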

The interesting bit

The project ships its own lightweight language model for disambiguation, which is unusual for a “fast” tokenizer. The README shows benchmark charts suggesting it keeps pace with or outruns competitors while still resolving ambiguous splits. Multithreading is built into the library itself, not bolted on by wrappers.

Key highlights

Core library in C++17 with prebuilt binaries for Windows, Linux, macOS, Android, plus ARM64 and PPC64LE
Auto typo correction (0.13.0+) with eval data showing recovery on web_with_typos.txt
Sentence splitting and tokenization benchmarks published, with links to reproduce
Web demo at kiwi.bab2min.pe.kr for quick testing
Active CI across x86_64, ARM64, PPC64LE, and WASM

Caveats

Swift wrapper is “coming soon” as of the README
Model files live in Git LFS; a clone made without Git LFS installed contains small pointer files instead of the actual models
The typo-correction mode loads more slowly and uses about 2.5× the memory (693 MB vs 278 MB in the sample run)
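The Git LFS caveat can be checked before trying to load anything. The sketch below scans a directory for files that are still LFS pointers, using the fact that a pointer file begins with the line `version https://git-lfs.github.com/spec/v1`. The function names are hypothetical, and the directory to scan is whatever folder holds the model files in your checkout.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """Return a sorted list of files under root that are still LFS pointers."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and is_lfs_pointer(p)
    )
```

If `find_lfs_pointers` returns anything for the model directory, running `git lfs install` followed by `git lfs pull` in the clone should replace the pointers with the real files.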

Verdict

Worth a look if you process Korean text at scale and need speed without sacrificing accuracy. Skip it if you only need English tokenization or if you are allergic to downloading large model files.

Frequently asked

What is bab2min/Kiwi? Kiwi is a fast, open-source Korean morphological analyzer with built-in typo correction and bindings for nearly every language you might actually use.
Is Kiwi open source? Yes, bab2min/Kiwi is an open-source project tracked on heatdrop.
What language is Kiwi written in? bab2min/Kiwi is primarily written in C++.
How popular is Kiwi? bab2min/Kiwi has 756 stars on GitHub.
Where can I find Kiwi? bab2min/Kiwi is on GitHub at https://github.com/bab2min/Kiwi.