JohnSnowLabs/spark-nlp-workshop

1,088 stars, zero hype: a Spark NLP cookbook that just works

A sprawling repo of runnable notebooks for the Spark NLP ecosystem, from annotation to training to Databricks.

★ 1.1k stars · Jupyter Notebook · ML Frameworks · Language Models · Learning

What it does

This is the official companion kitchen for John Snow Labs’ Spark NLP library: Jupyter notebooks, Colab-ready tutorials, and Databricks notebooks covering annotation pipelines, model training, and evaluation. The setup instructions are refreshingly explicit: Java 8, PySpark 3.1.2, pip install, done. There’s even a one-liner shell script for Colab that downloads and configures everything behind the scenes.
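The pinned prerequisites (Java 8, PySpark 3.1.2) are easy to check before opening a notebook. The following is a minimal sketch, not something taken from the repo: the function names and constants are hypothetical, and it assumes the PyPI packages are named pyspark and spark-nlp. It only reports mismatches and installs nothing.

```python
import re
import shutil
import subprocess
from importlib import metadata

# Versions quoted in this review; adjust if the README pins something else.
EXPECTED_JAVA_MAJOR = 8
EXPECTED_PYSPARK = "3.1.2"


def java_major_version(version_output):
    """Extract the major version from `java -version` output.

    Java 8 and earlier report themselves as 1.x (for example 1.8.0_292);
    Java 9 and later report the major version first (11.0.11, 17, ...).
    Returns None if no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Return a list of human-readable problems (empty means all good)."""
    problems = []

    java = shutil.which("java")
    if java is None:
        problems.append("java not found on PATH")
    else:
        # `java -version` prints to stderr, so both streams are inspected.
        result = subprocess.run([java, "-version"], capture_output=True, text=True)
        major = java_major_version(result.stderr + result.stdout)
        if major != EXPECTED_JAVA_MAJOR:
            problems.append(f"expected Java {EXPECTED_JAVA_MAJOR}, found {major}")

    pyspark = installed_version("pyspark")
    if pyspark != EXPECTED_PYSPARK:
        problems.append(f"expected pyspark {EXPECTED_PYSPARK}, found {pyspark}")

    if installed_version("spark-nlp") is None:
        problems.append("spark-nlp is not installed")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

An empty result means the environment matches the versions quoted above; it does not guarantee that every notebook in the repo runs against them.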

The interesting bit

The “old_generation_notebooks” folder in tutorials suggests this repo has been through enough iterations to accumulate historical baggage, yet someone is still maintaining backward compatibility. That’s either admirable diligence or a warning about API churn; the README doesn’t clarify which.

Key highlights

Python and Scala examples side by side (rare in notebook-land)
Dedicated Databricks folder for enterprise Spark deployments
One-shot Colab setup via wget | bash — convenient, if you trust it
Explicit dependency pinning (PySpark 3.1.2) rather than “latest and hope”
Apache 2.0 licensed, with a Slack community linked for support
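For readers uneasy about piping a remote script straight into bash, one alternative is to download it, record a checksum, and read it before running anything. This is a hedged sketch only: the helper names are hypothetical, and the script URL is deliberately not reproduced here, so it must be copied from the repo's README.

```python
import hashlib
import urllib.request
from pathlib import Path


def download(url, timeout=30.0):
    """Fetch the raw bytes at `url` without executing anything."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def save_for_review(script_bytes, destination):
    """Write the script to disk and return its SHA-256 hex digest.

    The digest lets you confirm later that the file you reviewed is the
    same file you eventually run.
    """
    Path(destination).write_bytes(script_bytes)
    return hashlib.sha256(script_bytes).hexdigest()


# Example usage (URL intentionally left as a placeholder):
#
#   setup_url = "<copy the Colab setup script URL from the README>"
#   digest = save_for_review(download(setup_url), "colab_setup.sh")
#   print("saved colab_setup.sh, sha256:", digest)
#
# After reading colab_setup.sh, run it explicitly with: bash colab_setup.sh
```

This trades the one-liner's convenience for a chance to see what the script installs and configures.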

Caveats

The “evalulation” typo in the table of contents has survived at least one README revision
No topics tagged on GitHub, making discovery harder than it should be
“Old generation” notebooks are still prominently linked; unclear if they’re deprecated or merely archived

Verdict

Grab this if you’re already committed to Spark NLP and need working starter code. Skip it if you’re looking for a general NLP tutorial: the Spark dependency and JVM tooling make this a niche affair.

Frequently asked

What is JohnSnowLabs/spark-nlp-workshop? A sprawling repo of runnable notebooks for the Spark NLP ecosystem, from annotation to training to Databricks.
Is spark-nlp-workshop open source? Yes. JohnSnowLabs/spark-nlp-workshop is open source, released under the Apache-2.0 license.
What language is spark-nlp-workshop written in? GitHub classifies JohnSnowLabs/spark-nlp-workshop as primarily Jupyter Notebook; the examples themselves are in Python and Scala.
How popular is spark-nlp-workshop? JohnSnowLabs/spark-nlp-workshop has 1.1k stars on GitHub.
Where can I find spark-nlp-workshop? JohnSnowLabs/spark-nlp-workshop is on GitHub at https://github.com/JohnSnowLabs/spark-nlp-workshop.