
google/budoux

BudouX is a standalone ML-powered line break organizer that prevents phrases from being split across lines.

1.6k stars · Python · Domain Apps · Language Models
Velocity · 7d: +1.0 ★ / day
Trend: steady

BudouX is a machine learning tool that organizes line breaks so that text wraps at phrase boundaries instead of in the middle of a word or phrase. It uses a small (~15KB) ML model and runs standalone, with no external APIs. It supports Japanese, Chinese, and Thai out of the box and can be trained for other languages. The tool is available as a Python module, a JavaScript library, and a Java library.
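To make the idea of phrase-aware line breaking concrete, here is a minimal sketch of how already-segmented phrase chunks could be packed into lines. It does not call BudouX: the `wrap_chunks` helper and the hard-coded chunk list are hypothetical and stand in for the output of a phrase segmenter. Line width is counted in characters, which is a simplifying assumption.

```python
def wrap_chunks(chunks, width):
    """Greedily pack phrase chunks into lines of at most `width` characters.

    Breaks are placed only between chunks, never inside one. A chunk that is
    longer than `width` is kept whole on its own line.
    """
    lines = []
    current = ""
    for chunk in chunks:
        # Start a new line if adding this chunk would exceed the width.
        if current and len(current) + len(chunk) > width:
            lines.append(current)
            current = chunk
        else:
            current += chunk
    if current:
        lines.append(current)
    return lines


# Hypothetical segmentation of a Japanese sentence ("The weather is nice today.").
# In practice this list would come from a phrase segmenter rather than
# being written by hand.
chunks = ["今日は", "天気です。"]

for line in wrap_chunks(chunks, width=6):
    print(line)
```

With a width of 6 the sentence wraps between the two chunks; with a width of 8 or more it fits on a single line. A naive character-based wrap at width 6 would instead cut the second phrase in half.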
