JohnSnowLabs/spark-nlp
A distributed natural language processing library built on Apache Spark with support for LLMs and 200+ languages.

Velocity (7d): +1.3 ★ / day · Trend: steady
Spark NLP is a production-grade NLP library that runs on Apache Spark for distributed processing. It offers over 100,000 pretrained pipelines and models covering tasks such as named entity recognition, sentiment analysis, question answering, machine translation, and spell checking across more than 200 languages. The library integrates with transformers, TensorFlow, and ONNX for model serving, and supports LLMs through llama.cpp integration.