facebookresearch/map-anything

One transformer, twelve 3D tasks, zero patience required

MapAnything turns a grab bag of inputs—images, poses, depth, calibration—into metric 3D geometry with a single feed-forward pass.

3.5k stars · Python · Computer Vision · Domain Apps
Velocity (7d): +13 ★/day · Trend: steady

What it does

MapAnything is a research framework from Meta and CMU that trains one transformer end-to-end to regress metric 3D scene geometry. Feed it images alone, or mix in intrinsics, depth maps, or camera poses; it returns point clouds, depth maps, camera poses, and confidence masks in a single pass. The framework wraps the full pipeline (data processing, training, inference, benchmarking, and profiling) in a modular Python package.

The interesting bit

The “universal” claim is backed by a unified interface that can swap in other reconstruction models (VGGT, DUSt3R, MASt3R, Pi3-X, etc.) without rewriting your pipeline. It also exports directly to COLMAP and Gaussian Splatting, which saves the usual ritual of format conversion.

Key highlights

  • Single model handles 12+ tasks: SfM, MVS, monocular depth, depth completion, registration, and others
  • Memory-efficient inference mode claims up to 2,000 views on 140 GB VRAM with “negligible” speed trade-off (see their profiling section)
  • Supports mixed inputs across views: one image might have only RGB, another RGB+depth+pose
  • Apache 2.0 licensed model variant available on Hugging Face alongside the default release
  • Built-in benchmarking and profiling tools to compare against external models
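
To make the mixed-input idea concrete, here is a minimal sketch of how per-view inputs with optional modalities might be represented and sanity-checked before being handed to a model. The `ViewInput` class, its field names, and both helper functions are hypothetical and exist only for illustration; they are not MapAnything's actual input format, so check the README for the real keys.

```python
from dataclasses import dataclass
from typing import List, Optional

import numpy as np


@dataclass
class ViewInput:
    """Hypothetical per-view container (NOT the MapAnything API)."""
    image: np.ndarray                        # (H, W, 3) RGB, always required
    intrinsics: Optional[np.ndarray] = None  # (3, 3) pinhole calibration
    depth: Optional[np.ndarray] = None       # (H, W) depth map
    cam2world: Optional[np.ndarray] = None   # (4, 4) pose, OpenCV convention


def available_modalities(view: ViewInput) -> List[str]:
    """List which inputs this view actually carries."""
    names = ["image"]
    for field in ("intrinsics", "depth", "cam2world"):
        if getattr(view, field) is not None:
            names.append(field)
    return names


def validate(view: ViewInput) -> None:
    """Raise ValueError if any provided modality has an unexpected shape."""
    if view.image.ndim != 3 or view.image.shape[2] != 3:
        raise ValueError("image must have shape (H, W, 3)")
    h, w = view.image.shape[:2]
    if view.intrinsics is not None and view.intrinsics.shape != (3, 3):
        raise ValueError("intrinsics must have shape (3, 3)")
    if view.depth is not None and view.depth.shape != (h, w):
        raise ValueError("depth must match the image resolution")
    if view.cam2world is not None and view.cam2world.shape != (4, 4):
        raise ValueError("cam2world must have shape (4, 4)")


# One view with RGB only, another with RGB + depth + pose.
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
views = [
    ViewInput(image=rgb),
    ViewInput(image=rgb, depth=np.ones((4, 6)), cam2world=np.eye(4)),
]
for v in views:
    validate(v)
    print(available_modalities(v))
```

Keeping every modality except the image optional mirrors the point in the highlights: each view declares what it has, and nothing forces all views to share the same set of inputs.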

Caveats

  • Camera poses must use the OpenCV cam2world convention; the README mentions a conversion helper but marks it TODO
  • PyTorch and CUDA versions are unpinned, so environment setup is left to you
  • The “negligible” speed/memory trade-off claim is theirs; verify against your own hardware
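
Since the conversion helper is marked TODO, a stopgap for the pose convention caveat could look like the sketch below. It assumes poses are 4x4 rigid transforms stored as NumPy arrays, and that "OpenCV convention" means camera axes of +X right, +Y down, +Z forward. The function names are illustrative and are not part of the MapAnything package.

```python
import numpy as np


def world2cam_to_cam2world(w2c: np.ndarray) -> np.ndarray:
    """Invert a 4x4 rigid world-to-camera matrix.

    For a rigid transform [R | t], the inverse is [R^T | -R^T t],
    which avoids a general matrix inverse.
    """
    R, t = w2c[:3, :3], w2c[:3, 3]
    c2w = np.eye(4)
    c2w[:3, :3] = R.T
    c2w[:3, 3] = -R.T @ t
    return c2w


def opengl_to_opencv_cam2world(c2w_gl: np.ndarray) -> np.ndarray:
    """Convert a cam2world pose from OpenGL to OpenCV camera axes.

    OpenGL cameras use +Y up and look along -Z; OpenCV cameras use
    +Y down and look along +Z. Flipping the camera's Y and Z axes
    (right-multiplying by diag(1, -1, -1, 1)) switches between them.
    """
    flip = np.diag([1.0, -1.0, -1.0, 1.0])
    return c2w_gl @ flip


# Example: an extrinsic (world2cam) with a 90-degree rotation about Z.
w2c = np.array([
    [0.0, -1.0, 0.0, 1.0],
    [1.0,  0.0, 0.0, 2.0],
    [0.0,  0.0, 1.0, 3.0],
    [0.0,  0.0, 0.0, 1.0],
])
c2w = world2cam_to_cam2world(w2c)
```

Which of the two functions you need depends on where the poses came from: calibration tools often emit world2cam extrinsics, while rendering-oriented datasets often store cam2world in OpenGL axes.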

Verdict

Worth a look if you’re building 3D reconstruction pipelines and want one model that handles messy, heterogeneous inputs without hand-tuning a separate setup per task. Skip it if you need guaranteed production robustness: this is explicitly framed as research code.
