nlp-uoregon/trankit

A multilingual NLP toolkit using transformer models to perform tokenization, parsing, and tagging across 56 languages.

794 stars · Python · ML Frameworks · Language Models
Velocity · 7d: +0.4 ★ / day
Trend: steady

Trankit is a Python toolkit for multilingual natural language processing built on PyTorch and transformer architectures. It provides pre-trained pipelines for sentence segmentation, tokenization, part-of-speech tagging, morphological tagging, lemmatization, and dependency parsing. The toolkit supports 56 languages using XLM-RoBERTa-based models and offers both a command-line interface and a Python API.
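As a rough illustration of the Python API, the sketch below runs a pre-trained pipeline and flattens the result into one row per token. It assumes that `trankit` is installed, that `Pipeline('english')` loads the English models (downloading them on first use), and that the returned document is a dict with a `sentences` list whose items hold `tokens` carrying `text`, `upos`, `head`, and `deprel` keys. The helper names `iter_tokens` and `annotate` are hypothetical and not part of Trankit.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

TokenRow = Tuple[Optional[str], Optional[str], Optional[int], Optional[str]]


def iter_tokens(doc: Dict[str, Any]) -> Iterator[TokenRow]:
    """Yield (text, upos, head, deprel) for each token in a Trankit-style dict.

    Assumes the layout {"sentences": [{"tokens": [{...}, ...]}, ...]}.
    Missing keys come back as None rather than raising.
    """
    for sentence in doc.get("sentences", []):
        for token in sentence.get("tokens", []):
            yield (
                token.get("text"),
                token.get("upos"),
                token.get("head"),
                token.get("deprel"),
            )


def annotate(text: str, language: str = "english") -> Dict[str, Any]:
    """Run a pre-trained Trankit pipeline on raw text (hypothetical wrapper)."""
    # Imported lazily so iter_tokens can be used without trankit installed.
    from trankit import Pipeline

    pipeline = Pipeline(language)  # fetches pre-trained models on first use
    return pipeline(text)


if __name__ == "__main__":
    document = annotate("Hello! This is Trankit.")
    for row in iter_tokens(document):
        print(row)
```

Keeping the flattening step separate from the model call means the row extraction can be checked against a small hand-built dict without loading any models.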
