sebp/scikit-survival

Scikit-learn's missing piece for time-to-event data

A Python library that plugs survival analysis into the scikit-learn ecosystem you already know.

1.3k stars · Python · Domain Apps · ML Frameworks
Velocity (7d): +0.4 ★ / day · Trend: steady

What it does

scikit-survival adds survival analysis models to scikit-learn’s familiar API. It handles the core statistical challenge of this domain: censored data, meaning cases where you know only that the event had not happened by the end of observation, not whether or when it happened later. The library lets you use standard scikit-learn machinery for preprocessing, cross-validation, and model selection while fitting time-to-event models.
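To make censoring concrete, here is a minimal sketch in plain NumPy. It is not scikit-survival's implementation: the toy data, the `kaplan_meier` helper, and the field names `event` and `time` are all assumptions made for illustration. It stores each outcome as an (event observed, time) pair and computes a hand-rolled Kaplan-Meier estimate, in which censored subjects still count as "at risk" until they leave the study.

```python
import numpy as np

# Hypothetical toy data: follow-up time in months, and whether the event
# was actually observed (False means the subject was right-censored).
time = np.array([5.0, 8.0, 8.0, 12.0, 15.0, 20.0])
event = np.array([True, False, True, True, False, True])

# One way to keep both pieces together: a structured array with a boolean
# event indicator and a float time (field names here are illustrative).
y = np.array(list(zip(event, time)), dtype=[("event", bool), ("time", float)])


def kaplan_meier(time, event):
    """Return the distinct event times and the survival estimate after each."""
    event_times = np.unique(time[event])
    surv = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)              # includes not-yet-censored subjects
        n_events = np.sum((time == t) & event)   # observed events exactly at t
        s *= 1.0 - n_events / at_risk
        surv.append(s)
    return event_times, np.array(surv)


times, surv = kaplan_meier(y["time"], y["event"])
for t, s in zip(times, surv):
    print(f"S({t:g}) = {s:.3f}")
```

Dropping the two censored rows, or treating their times as event times, would bias the estimate downward. Avoiding that bias is the reason survival models need the event indicator alongside the time.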

The interesting bit

The real value isn’t the models themselves; it’s the integration. Survival analysis usually lives in separate statistical packages with their own conventions. By building on scikit-learn, this library lets you drop survival models into existing ML pipelines without learning a second ecosystem.

Key highlights

  • Implements survival models compatible with scikit-learn’s fit/predict patterns
  • Handles right-censored data, the standard partial-observation case in clinical and reliability studies
  • Works with scikit-learn preprocessing and cross-validation utilities
  • Described in a peer-reviewed paper in the Journal of Machine Learning Research (JMLR, 2020)
  • Available via conda-forge and PyPI; requires Python 3.11+, numpy 2.0+, pandas 2.2+, scikit-learn 1.8

Caveats

  • Requires a C/C++ compiler for installation from source
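  • GPL v3 license may complicate commercial use: distributing software that incorporates it generally means releasing that software under the GPL as well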
  • README is light on which specific survival models are implemented; check the user guide for details

Verdict

Worth a look if you’re already in scikit-learn and need to model time-to-event outcomes with censored observations. Skip it if you need a deep statistical toolkit with frequentist inference and built-in plotting: this is ML-flavored survival analysis, not a replacement for R’s survival package.
