A Japanese IME that runs a local GPT-2 model
Karukan replaces rigid kana-kanji conversion rules with a local GPT-2 model so your keyboard can infer context and meaning in real time.

What it does
Karukan is a Japanese input method for Linux and macOS built in Rust. It converts romaji to hiragana and then to kanji using a GPT-2-based model executed locally via llama.cpp. The project bundles frontends for fcitx5 on Linux and InputMethodKit on macOS, backed by a shared engine that tracks state, handles live rewriting, and memorizes your preferred conversions.
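The first stage of that pipeline, romaji to hiragana, can be illustrated with a greedy longest-match lookup. This is a minimal sketch and not Karukan's actual code: the `TABLE` constant and the `romaji_to_hiragana` function are hypothetical names, the table is deliberately tiny, and details such as doubled consonants (っ) and contracted sounds are left out.

```rust
/// Hypothetical, heavily truncated romaji table (an assumption for this sketch).
/// A real input method covers the full syllabary.
const TABLE: &[(&str, &str)] = &[
    ("chi", "ち"), ("tsu", "つ"), ("shi", "し"),
    ("ka", "か"), ("ki", "き"), ("ku", "く"), ("ke", "け"), ("ko", "こ"),
    ("ga", "が"), ("gi", "ぎ"), ("gu", "ぐ"), ("ge", "げ"), ("go", "ご"),
    ("na", "な"), ("ni", "に"), ("nu", "ぬ"), ("ne", "ね"), ("no", "の"),
    ("ha", "は"), ("hi", "ひ"), ("fu", "ふ"), ("he", "へ"), ("ho", "ほ"),
    ("a", "あ"), ("i", "い"), ("u", "う"), ("e", "え"), ("o", "お"),
];

/// Converts romaji to hiragana by always consuming the longest matching key.
pub fn romaji_to_hiragana(input: &str) -> String {
    let mut out = String::new();
    let mut rest = input;
    'outer: while !rest.is_empty() {
        // Try 3-byte keys first, then 2-byte, then 1-byte.
        for len in (1..=3).rev() {
            // `get` returns None if the slice is out of range or splits a character.
            if let Some(head) = rest.get(..len) {
                if let Some(entry) = TABLE.iter().find(|entry| entry.0 == head) {
                    out.push_str(entry.1);
                    rest = &rest[len..];
                    continue 'outer;
                }
            }
        }
        // No table entry matched: a leftover "n" becomes ん,
        // anything else is passed through unchanged.
        let ch = rest.chars().next().unwrap();
        if ch == 'n' {
            out.push('ん');
        } else {
            out.push(ch);
        }
        rest = &rest[ch.len_utf8()..];
    }
    out
}
```

Calling `romaji_to_hiragana("nihongo")` walks through `ni`, `ho`, a lone `n`, and `go`. The hiragana string produced here is what the neural model would then turn into kanji candidates.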
The interesting bit
Rather than leaning entirely on classical morphological analysis, Karukan uses a local language model to choose the right kanji sequence. It also borrows Mozc’s candidate rewriter to generate variants like hex numbers and half-width katakana, and supports Slack-style :emoji triggers alongside phonetic kana input.
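To make the idea of a candidate rewriter concrete, here is a small sketch that expands a typed number into a few alternative spellings. It is an illustration only: `number_variants` is a hypothetical function, not part of Mozc or Karukan, and the choice of variants (original, full-width digits, lowercase hex) is an assumption.

```rust
/// Returns alternative spellings for a decimal number typed as ASCII digits.
/// Hypothetical helper for illustration; not the Mozc or Karukan rewriter.
pub fn number_variants(digits: &str) -> Vec<String> {
    // Only plain ASCII digit strings are rewritten.
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return Vec::new();
    }
    // Very long inputs that overflow u64 are simply not rewritten.
    let value: u64 = match digits.parse() {
        Ok(v) => v,
        Err(_) => return Vec::new(),
    };
    // Full-width digits start at U+FF10, in the same order as ASCII digits.
    let full_width: String = digits
        .chars()
        .map(|c| char::from_u32(0xFF10 + (c as u32 - '0' as u32)).unwrap_or(c))
        .collect();
    vec![digits.to_string(), full_width, format!("0x{:x}", value)]
}
```

A rewriter like this runs after conversion and only appends extra entries to the candidate list, so it never changes what the model ranked first.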
Key highlights
- Neural kana-kanji conversion via a local GPT-2 model running through llama.cpp
- Live conversion that updates candidates as you type without hitting space
- Context-aware suggestions that factor in surrounding text
- Learns from your selections and promotes them in future predictive lookup
- Candidate rewriter (from Mozc) auto-generates numeral variants, half-width katakana, and case changes; emoji lookup by kana or :trigger syntax
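The learning highlight above can be pictured as a per-reading counter that reorders candidates. The sketch below is an assumption about how such a feature might work, not Karukan's implementation: `LearningStore`, `record`, and `rerank` are hypothetical names, and a real engine would also persist the data and weigh recency.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory store of how often each surface form was chosen
/// for a given reading.
#[derive(Default)]
pub struct LearningStore {
    counts: HashMap<String, HashMap<String, u32>>,
}

impl LearningStore {
    /// Records that the user picked `surface` for `reading`.
    pub fn record(&mut self, reading: &str, surface: &str) {
        *self
            .counts
            .entry(reading.to_string())
            .or_default()
            .entry(surface.to_string())
            .or_insert(0) += 1;
    }

    /// Moves previously chosen candidates to the front.
    /// The sort is stable, so ties keep the order the model produced.
    pub fn rerank(&self, reading: &str, mut candidates: Vec<String>) -> Vec<String> {
        if let Some(seen) = self.counts.get(reading) {
            candidates.sort_by_key(|c| std::cmp::Reverse(seen.get(c).copied().unwrap_or(0)));
        }
        candidates
    }
}
```

For example, after `record("はし", "橋")`, reranking the candidates 箸, 橋, 端 for the reading はし puts 橋 first and leaves the other two in their original order.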
Caveats
- First launch downloads the neural model from Hugging Face, so expect a slow initial startup; subsequent runs use the cached model.
Verdict
Developers and writers who type Japanese on Linux or macOS and want an offline, learning-capable alternative to cloud-based or traditional rule-based IMEs should take a look. If your current Mozc or system setup already feels invisible, the first-launch download and local model overhead may not be worth the switch.
Frequently asked
- What is togatoga/karukan?
- Karukan replaces rigid kana-kanji conversion rules with a local GPT-2 model so your keyboard can infer context and meaning in real time.
- Is karukan open source?
- Yes — togatoga/karukan is open source, released under the Apache-2.0 license.
- What language is karukan written in?
- togatoga/karukan is primarily written in Rust.
- How popular is karukan?
- togatoga/karukan has 505 stars on GitHub.
- Where can I find karukan?
- togatoga/karukan is on GitHub at https://github.com/togatoga/karukan.