Scikit-learn's missing piece for time-to-event data
A Python library that plugs survival analysis into the scikit-learn ecosystem you already know.

What it does
scikit-survival adds survival analysis models to scikit-learn’s familiar API. It handles the core statistical challenge of this domain: censored data, meaning cases where you know only that an event had not happened by the end of observation, not whether or when it happened later. The library lets you use standard scikit-learn machinery for preprocessing, cross-validation, and model selection while fitting time-to-event models.
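To make the censoring point concrete, here is a minimal pure-Python sketch of a Kaplan-Meier estimate on a toy dataset. It is an illustration only, not scikit-survival's implementation; the function name `kaplan_meier` and the data are made up for this example. Censored subjects never count as events, but they stay in the at-risk group until the time they drop out.

```python
from itertools import groupby


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  -- observed time per subject (event time or censoring time)
    events -- True if the event was observed, False if right-censored
    """
    at_risk = len(times)   # subjects still under observation
    survival = 1.0
    curve = []
    data = sorted(zip(times, events))
    for t, group in groupby(data, key=lambda pair: pair[0]):
        group = list(group)
        n_events = sum(1 for _, observed in group if observed)
        if n_events:
            # Only observed events reduce the survival estimate.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        at_risk -= len(group)
    return curve


# Toy data: five subjects, two of them censored (at t=3 and t=8).
times = [2, 3, 3, 5, 8]
events = [True, False, True, True, False]

curve = kaplan_meier(times, events)
print([(t, round(s, 3)) for t, s in curve])
# [(2, 0.8), (3, 0.6), (5, 0.3)]
```

Dropping the two censored subjects, or treating them as events, would give a different and biased curve, which is why survival models need the event indicator alongside the time.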
The interesting bit
The real value isn’t the models themselves — it’s the integration. Survival analysis usually lives in separate statistical packages with their own conventions. By building on scikit-learn, this library lets you drop survival models into existing ML pipelines without learning a second ecosystem.
Key highlights
- Implements survival models compatible with scikit-learn’s fit/predict patterns
- Handles right-censored data, the standard partial-observation case in clinical and reliability studies
- Works with scikit-learn preprocessing and cross-validation utilities
- Described in a peer-reviewed paper in the Journal of Machine Learning Research (JMLR, 2020)
- Available via conda-forge and PyPI; requires Python 3.11+, numpy 2.0+, pandas 2.2+, scikit-learn 1.8
Caveats
- Requires a C/C++ compiler for installation from source
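- GPL v3 license may complicate commercial use: distributing software that incorporates the library generally means releasing that software under the GPL as well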
- README is light on which specific survival models are implemented; check the user guide for details
Verdict
Worth a look if you’re already in scikit-learn and need to model time-to-event outcomes with censored observations. Skip it if you need a deep statistical toolkit with frequentist inference and built-in plotting — this is ML-flavored survival analysis, not a replacement for R’s survival package.