nlp-uoregon/trankit
A multilingual NLP toolkit using transformer models to perform tokenization, parsing, and tagging across 56 languages.

Velocity (7d): +0.4 ★/day · Trend: steady
Trankit is a Python toolkit for multilingual natural language processing built on PyTorch and transformer architectures. It provides pre-trained pipelines for sentence segmentation, tokenization, part-of-speech tagging, morphological tagging, lemmatization, and dependency parsing. The toolkit supports 56 languages using XLM-RoBERTa-based models and offers both a command-line interface and a Python API.
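As a rough illustration of the Python API, the sketch below wraps a pipeline call and flattens the result into one row per token. It assumes the `Pipeline` class from the `trankit` package and a nested output of sentences and tokens with keys such as `upos`, `lemma`, `head`, and `deprel`; those key names should be checked against the installed version. The helper names `analyze` and `token_rows`, and the hand-written `SAMPLE_DOC`, are hypothetical and are not real model output.

```python
from typing import Any, Dict, List, Tuple

Row = Tuple[Any, Any, str, str, str, Any, str]


def analyze(text: str, lang: str = "english") -> Dict[str, Any]:
    """Run a full Trankit pipeline on `text`.

    Assumes `pip install trankit`; pre-trained models are downloaded on
    first use, so this call needs network access and some disk space.
    """
    from trankit import Pipeline  # imported lazily so the helpers below work without it

    pipeline = Pipeline(lang)
    return pipeline(text)


def token_rows(doc: Dict[str, Any]) -> List[Row]:
    """Flatten a document dict into (sent_id, tok_id, text, upos, lemma, head, deprel) rows.

    Missing annotations are filled with "_" (or None for the head index),
    so partially annotated tokens do not raise.
    """
    rows: List[Row] = []
    for sentence in doc.get("sentences", []):
        for token in sentence.get("tokens", []):
            rows.append((
                sentence.get("id"),
                token.get("id"),
                token.get("text", "_"),
                token.get("upos", "_"),
                token.get("lemma", "_"),
                token.get("head"),
                token.get("deprel", "_"),
            ))
    return rows


# Hand-written stand-in for pipeline output (assumed shape, illustrative tags only).
SAMPLE_DOC = {
    "text": "Hello world",
    "lang": "english",
    "sentences": [{
        "id": 1,
        "text": "Hello world",
        "tokens": [
            {"id": 1, "text": "Hello", "upos": "INTJ", "lemma": "hello",
             "head": 0, "deprel": "root"},
            {"id": 2, "text": "world", "upos": "NOUN", "lemma": "world",
             "head": 1, "deprel": "vocative"},
        ],
    }],
}

if __name__ == "__main__":
    for row in token_rows(SAMPLE_DOC):
        print("\t".join(str(field) for field in row))
```

With the package installed, `token_rows(analyze("Hello world"))` would exercise the real models instead of the sample dictionary. Keeping the flattening step separate from the pipeline call makes it possible to test the post-processing without downloading any models.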