Is spacy-stanza open source?

Yes — explosion/spacy-stanza is open source, released under the MIT license.

What language is spacy-stanza written in?

explosion/spacy-stanza is primarily written in Python.

How popular is spacy-stanza?

explosion/spacy-stanza has 747 stars on GitHub.

Where can I find spacy-stanza?

explosion/spacy-stanza is on GitHub at https://github.com/explosion/spacy-stanza.

explosion/spacy-stanza

Stanford's NLP models, finally speaking spaCy

A compatibility wrapper that lets you drop Stanza's research-grade multilingual pipelines into spaCy's ecosystem without rewriting your code.

★747 stars Python Other AI

What it does

spacy-stanza is a bridge, not a model. It wraps Stanford’s Stanza library so its tokenization, tagging, parsing, and NER outputs populate standard spaCy Doc objects. You call spacy_stanza.load_pipeline("en") and get back something that behaves like any other spaCy nlp object — displaCy visualizations, custom components, nlp.pipe, the lot.
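A minimal sketch of that call, assuming stanza and spacy-stanza are installed and the English models can be downloaded. The helper names annotate and token_rows are mine, not part of either library; the imports sit inside the function so the snippet can be loaded without the packages present.

```python
def annotate(text, lang="en"):
    """Run a Stanza pipeline through spaCy and return the resulting Doc."""
    # Imported lazily: both packages are third-party and the models are large.
    import stanza
    import spacy_stanza

    stanza.download(lang)                   # fetch the Stanza models for this language
    nlp = spacy_stanza.load_pipeline(lang)  # behaves like a regular spaCy nlp object
    return nlp(text)


def token_rows(doc):
    """Collect the usual spaCy token attributes as plain tuples."""
    return [(t.text, t.lemma_, t.pos_, t.dep_, t.ent_type_) for t in doc]
```

Something like token_rows(annotate("Barack Obama was born in Hawaii.")) should then give one tuple per token, and the returned Doc exposes doc.ents as usual. The exact labels depend on the Stanza model that gets downloaded.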

The interesting bit

The clever part is where the work happens: everything runs inside a custom StanzaTokenizer, so Stanza’s full pipeline executes at tokenization time and writes all annotations (lemmas, dependencies, entities) into the Doc before downstream components even see it. It’s a bit of a hack, but it lets you bolt spaCy-specific tools, say an EntityRuler or a text classifier, on top of Stanza’s outputs.
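To make the "everything happens in the tokenizer" idea concrete, here is a toy model of the pattern in plain Python. It is not spacy-stanza's code: ToyDoc, ToyNlp, ToyPipelineAsTokenizer, fake_stanza and rule_component are all hypothetical stand-ins, and the "annotator" is a few string operations rather than a neural pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ToyDoc:
    """Hypothetical stand-in for a spaCy Doc."""
    words: List[str]
    lemmas: List[str] = field(default_factory=list)
    ents: List[Tuple[str, str]] = field(default_factory=list)  # (text, label)


class ToyPipelineAsTokenizer:
    """Sits in the tokenizer slot but runs a whole external annotator."""

    def __init__(self, annotate: Callable[[str], ToyDoc]):
        self.annotate = annotate

    def __call__(self, text: str) -> ToyDoc:
        # Tokens, lemmas and entities are all filled in here, up front.
        return self.annotate(text)


class ToyNlp:
    """Hypothetical stand-in for a spaCy Language object."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer
        self.components = []

    def add_pipe(self, component):
        self.components.append(component)

    def __call__(self, text: str) -> ToyDoc:
        doc = self.tokenizer(text)          # step 1: "tokenization" does all the annotation
        for component in self.components:   # step 2: later components see a finished doc
            doc = component(doc)
        return doc


def fake_stanza(text: str) -> ToyDoc:
    """Pretend annotator: lowercased lemmas, capitalized words become entities."""
    words = text.split()
    lemmas = [w.lower() for w in words]
    ents = [(w, "NAME") for w in words if w[0].isupper()]
    return ToyDoc(words, lemmas, ents)


def rule_component(doc: ToyDoc) -> ToyDoc:
    """Downstream rule, in the spirit of an EntityRuler, layered on top."""
    if "spaCy" in doc.words and ("spaCy", "PRODUCT") not in doc.ents:
        doc.ents.append(("spaCy", "PRODUCT"))
    return doc


nlp = ToyNlp(ToyPipelineAsTokenizer(fake_stanza))
nlp.add_pipe(rule_component)
doc = nlp("Stanza feeds spaCy")
```

The rule component never calls the annotator; it only reads and extends what the tokenizer step already wrote. That is the trade the real wrapper makes: downstream components are easy to add, but everything Stanza does has to finish before any of them run.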

Key highlights

Supports 68+ languages with Stanza’s pretrained models; falls back to spaCy’s xx language class when spaCy lacks dedicated support
Full spaCy API compatibility: doc.ents, token.dep_, displacy, custom pipeline components, serialization via nlp.to_disk()
Stanza pipeline options (language packages, pretokenized input, GPU use) pass through as keyword arguments or spaCy config blocks
spaCy v3.x only; v2.x users must pin to spacy-stanza<0.3.0
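A hedged sketch of the keyword-argument route: the option names below (package, use_gpu, processors, tokenize_pretokenized) follow the examples I recall from the README, so check them against the version you install. The dictionary names and the load_variant helper are my own, for illustration only.

```python
# Option sets modeled on the README examples (verify against your installed version).
BIOMEDICAL = {"package": "craft", "use_gpu": False}      # alternative English package, CPU only
SPACY_TOKENIZER = {"processors": {"tokenize": "spacy"}}  # spaCy's tokenizer, English only
PRETOKENIZED = {"tokenize_pretokenized": True}           # input is already split into tokens


def load_variant(lang, options):
    """Forward Stanza pipeline options through spacy_stanza.load_pipeline."""
    import spacy_stanza  # imported lazily; requires spacy-stanza and downloaded models

    return spacy_stanza.load_pipeline(lang, **options)
```

For example, load_variant("en", SPACY_TOKENIZER) would ask for an English pipeline that tokenizes with spaCy, and load_variant("de", PRETOKENIZED) for a German one that trusts your whitespace. The same settings can live in the tokenizer section of a spaCy config instead of being passed in code.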

Caveats

Serialization saves the pipeline config but not the Stanza model weights; the models still have to be downloaded separately with stanza.download() wherever the pipeline is loaded
Swapping Stanza’s tokenizer for spaCy’s own works for English only
Stanza models are “very large” (README’s words, not mine), so this is not a lightweight deployment option
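The serialization caveat is easiest to see as a round trip. This is a sketch under the assumption that spaCy v3, stanza and spacy-stanza are all installed; save_and_reload is a hypothetical helper name, and the path is whatever directory you choose.

```python
def save_and_reload(lang, path):
    """Save a spacy-stanza pipeline to disk and load it back with spacy.load."""
    import spacy
    import stanza
    import spacy_stanza  # must also be installed wherever the pipeline is reloaded

    stanza.download(lang)                   # model weights live in Stanza's own directory
    nlp = spacy_stanza.load_pipeline(lang)
    nlp.to_disk(path)                       # writes the config, not the Stanza weights
    return spacy.load(path)                 # works only if the Stanza models are present
```

On a fresh machine, copying the saved directory alone is not enough: the stanza.download() step has to run there too before spacy.load() can rebuild the pipeline.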

Verdict

Worth a look if you need Stanza’s multilingual accuracy or CoNLL-winning parsers but can’t abandon your spaCy-based tooling. Skip it if you’re building from scratch and don’t need both ecosystems; the wrapper adds friction and model bloat for no gain.

Frequently asked

What is explosion/spacy-stanza?: A compatibility wrapper that lets you drop Stanza's research-grade multilingual pipelines into spaCy's ecosystem without rewriting your code.