Deep learning for when your subjects won't hold still
SLEAP tracks multiple animals through video without requiring them to cooperate, stay separate, or even face the camera.

What it does
SLEAP is a deep learning framework that locates body parts on animals in video—think eyes, paws, tails—then tracks those parts across frames even when animals overlap, occlude each other, or move unpredictably. It ships with a full GUI for labeling training data, active learning to reduce annotation drudgery, and both top-down and bottom-up pose estimation strategies. Training runs 15–60 minutes on a single GPU; inference hits 600+ FPS in batch mode or sub-10ms latency for real-time work.
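To put those speed figures in practical terms, here is a rough back-of-the-envelope sketch. The 600 FPS and 10 ms numbers come from the text above; the one-hour, 30 FPS recording is a made-up example, and the function names are illustrative rather than part of SLEAP. Real throughput will depend on hardware, model choice, and frame size.

```python
def batch_seconds(duration_s: float, video_fps: float,
                  inference_fps: float = 600.0) -> float:
    """Wall-clock seconds to process a recording at a given batch throughput.

    inference_fps defaults to the 600 FPS batch figure quoted above.
    """
    n_frames = duration_s * video_fps
    return n_frames / inference_fps


def max_realtime_fps(latency_ms: float = 10.0) -> float:
    """Upper bound on frame rate if frames are processed one at a time."""
    return 1000.0 / latency_ms


# Hypothetical recording: one hour of video captured at 30 FPS.
print(batch_seconds(3600, 30))   # 180.0 seconds, i.e. about three minutes
print(max_realtime_fps(10.0))    # 100.0 frames per second at 10 ms latency
```

At the quoted rates, an hour of footage is a coffee-break job, and a 10 ms latency leaves headroom for closed-loop experiments running at up to roughly 100 FPS.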
The interesting bit
The project split its guts into two independent backends: sleap-nn handles training and inference pipelines, while sleap-io manages file formats. That separation means you can script batch processing or build custom pipelines without dragging the entire GUI along for the ride. The README also quietly notes remote training/inference support—useful if your lab has one shared GPU cluster and twelve biologists with laptops.
Key highlights
- Multi-animal tracking with top-down and bottom-up approaches
- Built-in GUI with active learning and proofreading workflows
- Fast training (15–60 min on single GPU) and inference (600+ FPS batch, <10ms real-time)
- Remote training/inference support for GPU-less machines
- Two decoupled backends (sleap-nn, sleap-io) for programmatic use
- Published in Nature Methods (2022), successor to the earlier LEAP tool
Caveats
- Python 3.14 is explicitly not supported yet; stick to 3.11–3.13
- The uv install commands vary by OS and CUDA version, so read carefully before copy-pasting
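For the Python version caveat, a quick pre-flight check can save a failed install. This is a minimal sketch, not SLEAP's own check: the 3.11–3.13 range is taken from the caveat above, and `python_supported` is a hypothetical helper name.

```python
import sys

# Range quoted in the caveat above; adjust if the project's docs change.
SUPPORTED_MIN = (3, 11)
SUPPORTED_MAX = (3, 13)


def python_supported(version=None) -> bool:
    """Return True if the (major, minor) version falls in the supported range.

    With no argument, checks the interpreter that is currently running.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if not python_supported():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} "
        "is outside the 3.11-3.13 range"
    )
```

Running this in the environment you plan to install into tells you whether to create a fresh one with a different interpreter first.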
Verdict
Biologists, neuroscientists, and anyone studying animal behavior who needs quantitative movement data should grab this. Pure computer vision researchers looking for general human pose estimation should look elsewhere—SLEAP’s optimizations are animal-specific.