Is mleap open source?

Yes — combust/mleap is open source, released under the Apache-2.0 license.

What language is mleap written in?

combust/mleap is primarily written in Scala.

How popular is mleap?

combust/mleap has 1.5k stars on GitHub.

Where can I find mleap?

combust/mleap is on GitHub at https://github.com/combust/mleap.

combust/mleap

Ditch the Spark cluster for model serving

MLeap serializes ML pipelines from Spark and Scikit-learn into a lightweight JVM runtime that runs without their heavy dependencies.

★1.5k stars Scala Inference · Serving ML Frameworks

What it does

MLeap is a Scala-based execution engine and serialization format for machine learning pipelines. You train models in Spark, PySpark, or Scikit-learn, export them to the portable Bundle.ML format, then run inference in the MLeap runtime, with no SparkContext, numpy, or pandas required.
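The train, export, serve flow described here can be illustrated with a toy sketch. This is not MLeap's API or the Bundle.ML format: the stage names, the `toy-bundle` label, and every function below are hypothetical, written in plain Python with only the standard library. It assumes a fitted pipeline can be reduced to its parameters plus a small interpreter that needs none of the training libraries.

```python
import json

# --- "Training" side: fit a toy pipeline (standard scaler + linear model). ---
def fit_scaler(values):
    """Compute the parameters a standard scaler needs at serving time."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return {"op": "standard_scaler", "mean": mean, "std": var ** 0.5}

def export_bundle(stages):
    """Serialize fitted stages to JSON text (a toy stand-in for a bundle)."""
    return json.dumps({"format": "toy-bundle", "stages": stages})

# --- "Serving" side: needs only the JSON text and this small interpreter. ---
def load_bundle(text):
    return json.loads(text)["stages"]

def run(stages, x):
    """Apply each serialized stage to a single input value, in order."""
    for stage in stages:
        if stage["op"] == "standard_scaler":
            x = (x - stage["mean"]) / stage["std"]
        elif stage["op"] == "linear":
            x = stage["weight"] * x + stage["bias"]
        else:
            raise ValueError(f"unknown op: {stage['op']}")
    return x

training_data = [2.0, 4.0, 6.0, 8.0]
# The linear stage's weight and bias are made-up values for illustration.
stages = [fit_scaler(training_data), {"op": "linear", "weight": 3.0, "bias": 1.0}]
bundle_text = export_bundle(stages)

prediction = run(load_bundle(bundle_text), 5.0)
print(prediction)  # 1.0 (the input equals the training mean, so only the bias remains)
```

The serving half (`load_bundle` and `run`) touches only the JSON text, which is the property that lets a runtime drop the training stack.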

The interesting bit

The project treats “training environment ≠ serving environment” as a first-class concern. It provides parity tests to ensure Spark and MLeap transformers behave identically, and supports both JSON and Protobuf serialization. The Scikit-learn integration is essentially glue code: it wraps sklearn components in MLeap-specific pipeline classes so they can be serialized.
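A parity test of the kind mentioned above can be sketched as follows. This illustrates the idea only and is not taken from MLeap's test suite: both scaler functions and `check_parity` are hypothetical names, and the tolerance is an assumed value.

```python
import math

def batch_min_max(values, lo, hi):
    """'Training-side' implementation: scales a whole column at once."""
    span = hi - lo
    return [(v - lo) / span for v in values]

def row_min_max(value, lo, hi):
    """'Serving-side' implementation: scales one value per request,
    using a different but algebraically equivalent formula."""
    return value / (hi - lo) - lo / (hi - lo)

def check_parity(values, lo, hi, tol=1e-9):
    """Return (input, expected, got) for every input where the two disagree."""
    expected = batch_min_max(values, lo, hi)
    mismatches = []
    for v, e in zip(values, expected):
        got = row_min_max(v, lo, hi)
        if not math.isclose(got, e, rel_tol=tol, abs_tol=tol):
            mismatches.append((v, e, got))
    return mismatches

mismatches = check_parity([0.0, 2.5, 5.0, 10.0], lo=0.0, hi=10.0)
print(mismatches)  # [] means the two implementations agree on these inputs
```

Comparing within a tolerance rather than for exact equality matters here, since two correct implementations can differ in the last floating-point digits.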

Key highlights

Core runtime is JVM/Scala; supports Spark 4.0.1 down to 2.4.5, Python 3.9–3.13
Serializes to JSON or Protobuf; executes without Spark or sklearn dependencies
Custom transformers and data types can be implemented for use across all supported frameworks
Optional Spark transformer extensions beyond the default MLlib offerings
Extensive test coverage with full parity tests between Spark and MLeap pipelines

Caveats

The README claims “blazing fast speeds” but offers no benchmarks to back it up
Scikit-learn integration requires using MLeap’s wrapper classes (mleap.sklearn.pipeline.Pipeline, mlinit()) rather than native sklearn APIs
The version compatibility matrix is extensive and complex: Java 8 through 17, multiple Scala versions, and tight coupling to specific Spark versions

Verdict

Worth a look if you’re serving Spark ML models and tired of dragging a cluster into production. Less compelling if you’re already happy with ONNX, TensorFlow Serving, or a pure-Python stack.

Frequently asked

What is combust/mleap?: MLeap serializes ML pipelines from Spark and Scikit-learn into a lightweight JVM runtime that runs without their heavy dependencies.