SLAM that learns what geometry looks like before it sees it
MASt3R-SLAM feeds pretrained 3D reconstruction priors into a real-time dense SLAM pipeline so the system isn't starting from scratch every frame.

What it does
MASt3R-SLAM is a real-time dense SLAM system that builds 3D maps from monocular video, live RealSense streams, or image folders. It accepts known camera intrinsics or runs without calibration, and ships with evaluation scripts for TUM-RGBD, 7-Scenes, EuRoC, and ETH3D.
The interesting bit
The twist is the “reconstruction prior.” Instead of bootstrapping scene structure from scratch, the system leans on MASt3R, a pretrained model that already knows how to predict (and match) 3D points from pairs of images. That prior gets wired into the SLAM pipeline so the map isn’t just triangulated; it’s informed.
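To make the idea concrete, here is a minimal sketch of why per-pixel 3D predictions are useful: once a model gives you 3D points for matched pixels in two frames, the relative camera motion can be recovered by rigid alignment rather than by triangulating from 2D features. The sketch uses the classic Kabsch algorithm on synthetic points with NumPy. It is a simplified stand-in for illustration, not MASt3R-SLAM's actual tracking or optimization, and `kabsch_align` and all variable names are hypothetical.

```python
import numpy as np

def kabsch_align(src, dst):
    """Least-squares rigid fit: find R, t so that R @ src_i + t ~= dst_i."""
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)
    # 3x3 cross-covariance of the centred point sets.
    H = (src - src_mean).T @ (dst - dst_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t

# Synthetic stand-in for matched 3D points predicted in two frames
# (assumption: a real system would get these from a pretrained model).
rng = np.random.default_rng(0)
points_a = rng.uniform(-1.0, 1.0, size=(200, 3))

angle = 0.3  # radians, rotation about the z axis
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([0.1, -0.2, 0.05])
points_b = points_a @ R_true.T + t_true

R_est, t_est = kabsch_align(points_a, points_b)
```

With noise-free matches the estimate recovers the true motion up to floating-point error; real predictions are noisy and may differ in scale, so a practical system needs robust weighting and a richer pose model than this sketch provides.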
Key highlights
- Runs live on RealSense or offline on MP4s / image folders
- Supports calibrated and uncalibrated modes
- Evaluated on standard SLAM benchmarks (TUM, 7-Scenes, EuRoC, ETH3D)
- Built atop MASt3R, DROID-SLAM, and ModernGL
- CVPR 2025; code, checkpoints, and video are public
Caveats
- WSL users need a separate windows branch because multiprocessing shared memory breaks
- The authors note “minor differences” between this released multi-processing version and paper results
- All experiments ran on an RTX 4090; performance on other GPUs is unspecified
Verdict
Worth a look if you’re building or benchmarking dense SLAM and want to see whether foundation-model priors actually buy you robustness or just fancy point clouds. Skip if you need guaranteed reproducibility against the paper numbers or are on modest hardware.