Satellite imagery ML without the GDAL headaches
A Python framework that turns Copernicus and Landsat data into NumPy arrays your ML pipeline can actually digest.

What it does
eo-learn is a modular Python toolkit for building Earth-observation workflows. It fetches spatio-temporal satellite data (Sentinel, Landsat, etc.), handles cloud masking, co-registration, and feature extraction, and hands off clean NumPy arrays to standard ML tools. The core abstractions are EOPatch (a container for EO data), EOTask (a single processing step), and EOWorkflow (the chain that runs them).
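To make the patch/task/workflow split concrete, here is a minimal sketch of the pattern in plain NumPy. It deliberately does not import eo-learn: ToyPatch, ToyTask, ToyWorkflow, AddNDVI, and FlattenForML are hypothetical stand-ins invented for illustration, and the real EOPatch, EOTask, and EOWorkflow classes have richer interfaces. The (time, height, width, bands) array layout and the band indices are also assumptions of this sketch.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class ToyPatch:
    """Hypothetical stand-in for an EOPatch: named arrays for one area."""
    data: dict = field(default_factory=dict)


class ToyTask:
    """Hypothetical stand-in for an EOTask: a single processing step."""

    def execute(self, patch: ToyPatch) -> ToyPatch:
        raise NotImplementedError


class AddNDVI(ToyTask):
    """Compute NDVI = (NIR - red) / (NIR + red) from a band stack."""

    def __init__(self, red_idx: int, nir_idx: int):
        self.red_idx = red_idx
        self.nir_idx = nir_idx

    def execute(self, patch: ToyPatch) -> ToyPatch:
        bands = patch.data["bands"]  # assumed shape: (time, height, width, bands)
        red = bands[..., self.red_idx]
        nir = bands[..., self.nir_idx]
        denom = nir + red
        # Where the denominator is zero, leave NDVI at 0 instead of dividing.
        ndvi = np.divide(nir - red, denom, out=np.zeros_like(denom), where=denom != 0)
        patch.data["ndvi"] = ndvi[..., np.newaxis]  # keep a trailing band axis
        return patch


class FlattenForML(ToyTask):
    """Reshape (time, height, width, 1) into a (pixels, time) feature matrix."""

    def execute(self, patch: ToyPatch) -> ToyPatch:
        ndvi = patch.data["ndvi"]
        t, h, w, d = ndvi.shape
        # Move time behind the spatial axes so each row is one pixel's time series.
        patch.data["features"] = ndvi.transpose(1, 2, 0, 3).reshape(h * w, t * d)
        return patch


class ToyWorkflow:
    """Hypothetical stand-in for an EOWorkflow: runs tasks in order."""

    def __init__(self, tasks):
        self.tasks = list(tasks)

    def run(self, patch: ToyPatch) -> ToyPatch:
        for task in self.tasks:
            patch = task.execute(patch)
        return patch


# Synthetic reflectance values: 3 dates, 4x5 pixels, 2 bands (red, NIR).
rng = np.random.default_rng(0)
patch = ToyPatch(data={"bands": rng.random((3, 4, 5, 2))})

workflow = ToyWorkflow([AddNDVI(red_idx=0, nir_idx=1), FlattenForML()])
result = workflow.run(patch)
print(result.data["features"].shape)  # (20, 3)
```

The resulting matrix has one row per pixel and one column per date, which is the layout most tabular ML libraries expect. Keeping each step as its own small class is what lets steps be swapped or reordered without touching the container.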
The interesting bit
The project deliberately bridges two cultures: remote-sensing experts get Python’s ML ecosystem, while data scientists get shielded from the gnarly details of satellite data formats. It also leans on Sentinel Hub’s services for data access, so you’re not managing terabytes of raw imagery locally.
Key highlights
- Modular task library: cloud masking, geometry ops, I/O, visualization, and ML utilities are separate installable modules
- Optional extras keep the base install light: pip install "eo-learn[EXTRA]" pulls in cloud detection, interpolation, etc.
- Docker images with Jupyter included, plus a separate latest-examples image with notebooks and sample data
- Python ≥3.8; requires system-level GDAL/PROJ libraries on Linux and Mac (Windows needs unofficial wheels)
- Active real-world use: World Bank, agricultural monitoring, land-cover classification tutorials
Caveats
- Some community-contributed tasks were moved to a separate eo-learn-examples repo to keep maintenance manageable; the README notes these may not be up-to-date
- Windows installation is explicitly more painful (unofficial wheel repository for GDAL, rasterio, shapely, fiona)
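Since the geospatial dependencies are the usual point of failure, a quick pre-flight check can tell you whether they are in place before you attempt the install. The sketch below uses only the standard library; the helper name missing_modules and the GEO_STACK tuple are hypothetical, and the module names are simply the import names of the packages listed above.

```python
import importlib.util

# Import names of the GDAL-backed packages mentioned in the caveats.
GEO_STACK = ("rasterio", "shapely", "fiona")


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(GEO_STACK)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("Geospatial stack found.")
```

This only confirms that the packages can be located, not that their compiled GDAL/PROJ bindings load correctly, so a clean result is a necessary check rather than a guarantee.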
Verdict
Worth a look if you’re doing applied ML on satellite time-series and want Python-native tooling. Skip it if you’re already happy with Google Earth Engine’s JavaScript API or need a fully managed cloud solution.