MITIE: A no-strings-attached NER toolkit from MIT
Free, commercial-friendly named entity extraction and relation detection with pre-trained models for English, Spanish, and German.

What it does
MITIE extracts named entities (people, places, organizations) from raw text and detects binary relations between them. It ships with pre-trained models for English, Spanish, and German, plus tools to train your own extractors. The core is C++, with bindings for Python, R, Java, C, and MATLAB, and a small ecosystem of third-party wrappers for OCaml, .NET, PHP, and Ruby.
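As a rough illustration of the Python route, the sketch below loads a pre-trained model, tokenizes a string, and turns the extractor's output into readable rows. The calls `named_entity_extractor`, `tokenize`, and `extract_entities` follow MITIE's bundled Python examples; the helper names `format_entities` and `extract_entities_from_text` are hypothetical, and the code assumes each entity comes back as a (token range, tag, score) tuple, which may differ in older releases.

```python
def format_entities(tokens, entities):
    """Turn (token_range, tag, score) tuples into (tag, text, score) rows.

    Assumes the three-element tuple shape used in MITIE's Python examples.
    """
    rows = []
    for token_range, tag, score in entities:
        # Under Python 3 the tokenizer returns bytes, so decode when needed.
        words = [
            tokens[i].decode("utf-8") if isinstance(tokens[i], bytes) else tokens[i]
            for i in token_range
        ]
        rows.append((tag, " ".join(words), round(score, 3)))
    return rows


def extract_entities_from_text(text, model_path):
    """Run a MITIE NER model over raw text (hypothetical helper)."""
    # Imported lazily so format_entities stays usable without MITIE installed.
    from mitie import named_entity_extractor, tokenize

    ner = named_entity_extractor(model_path)  # path to a downloaded ner_model.dat
    tokens = tokenize(text)
    return format_entities(tokens, ner.extract_entities(tokens))
```

A call such as `extract_entities_from_text("John Smith visited Boston.", "MITIE-models/english/ner_model.dat")` would then return one row per detected entity. The model path here is an assumption about where the English model was unpacked.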
The interesting bit
The authors openly admit MITIE is “basically just a thin wrapper around dlib”, a refreshing dose of honesty in academic software. The heavy lifting comes from dlib’s machine learning toolkit, combined with distributional word embeddings and Structural SVMs. The value is in the packaging: pre-trained models built on CoNLL 2003, ACE, Wikipedia, Freebase, and Gigaword, ready to run from a command-line pipe or your language of choice.
Key highlights
- Boost Software License: genuinely free, including commercial use
- Pre-trained NER models for three languages; relation detection included
- Python support spans 2.7 through 3.8+ using only the standard library's ctypes
- Command-line streaming tool (ner_stream) for quick text markup
- CMake and make builds, with optional OpenBLAS acceleration
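For the streaming tool, a minimal sketch of driving it from Python might look like the following. It assumes `ner_stream` takes the model path as its only argument, reads text on standard input, and writes the text back with entities wrapped as `[TAG entity words]`. The binary and model paths are placeholders, and `run_ner_stream` and `parse_markup` are hypothetical helper names, so check the actual output on your build before relying on the pattern.

```python
import re
import subprocess

# Assumed markup shape: "[TAG entity words]" inline in the output text.
_ENTITY = re.compile(r"\[([A-Z_]+) ([^\]]+)\]")


def parse_markup(marked_up):
    """Collect (tag, entity_text) pairs from bracketed markup."""
    return [(m.group(1), m.group(2)) for m in _ENTITY.finditer(marked_up)]


def run_ner_stream(text, binary="./ner_stream",
                   model="MITIE-models/english/ner_model.dat"):
    """Pipe text through the ner_stream binary and return its stdout.

    Both default paths are assumptions about a local build and download.
    """
    result = subprocess.run(
        [binary, model],
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise if the tool exits with an error
    )
    return result.stdout
```

Chaining the two, `parse_markup(run_ner_stream(text))` gives a flat list of tagged entities without touching the language bindings at all.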
Caveats
- Model downloads are manual and split by language — no single “install and go” package
- Java bindings require SWIG, CMake, and careful 64-bit Windows handling
- The README’s “state-of-the-art” claim links to a wiki evaluation page, not inline numbers
Verdict
Worth a look if you need a permissively-licensed, self-hosted NER solution without the dependency bloat of modern neural toolkits. Skip it if you want SOTA transformer-based accuracy or a batteries-included pip install.