zju3dv/InfiniDepth

Depth maps that scale like vector graphics, not bitmaps

A CVPR 2026 project turns single RGB images into resolution-independent depth using neural implicit fields, then goes further into 3D Gaussians and sensor fusion.

1k stars · Python · Computer Vision · Velocity (7d): +6.6 ★/day · Trend: steady

What it does

InfiniDepth estimates depth from a single RGB image at any resolution you ask for (upsampled, original, or a specific size) rather than being locked to the network’s training resolution. It can also spit out 3D Gaussian Splatting scenes, novel-view orbit videos, and point clouds. If you have a depth sensor, a separate mode fuses its sparse metric readings with the image to produce metric depth and aligned 3D Gaussians.

The interesting bit

The “arbitrary-resolution” claim is the hook: most depth networks output a fixed grid, but InfiniDepth uses neural implicit fields to query depth continuously, like sampling a signed distance function at whatever density you need. The repo also bundles multi-view/video processing that aligns per-frame predictions into a global point cloud, optionally using Depth Anything 3 for sequence-level consistency.
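To make the "query depth continuously" idea concrete, here is a minimal sketch. It is not InfiniDepth's model or API: `toy_depth_field` is a hypothetical analytic function standing in for a learned implicit field, and `query_depth` is an illustrative helper. The point is only that the field is defined over continuous normalized coordinates, so the output grid size is a choice made at query time.

```python
import math


def toy_depth_field(u, v):
    """Hypothetical stand-in for a learned implicit field.

    Maps continuous image coordinates (u, v) in [0, 1] to a depth value.
    A real model would also condition on features extracted from the image.
    """
    return 2.0 + 0.5 * math.sin(2.0 * math.pi * u) * math.cos(2.0 * math.pi * v)


def query_depth(field, height, width):
    """Sample `field` at the pixel centres of a height x width grid."""
    return [
        [field((col + 0.5) / width, (row + 0.5) / height) for col in range(width)]
        for row in range(height)
    ]


# The same field rendered at two resolutions; no interpolation of a fixed grid.
low = query_depth(toy_depth_field, 3, 3)
high = query_depth(toy_depth_field, 9, 9)

# The centre pixel of both grids lands on the same continuous coordinate (0.5, 0.5).
print(low[1][1], high[4][4])
```

Because both grids sample the same underlying function, the higher-resolution output adds detail between the coarse samples instead of blurring them, which is the property a bitmap-style depth map lacks.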

Key highlights

  • Three inference modes: RGB-only relative depth, RGB + depth sensor metric depth, and multi-view/video with global alignment
  • Outputs depth maps, point clouds (.ply), 3D Gaussian scenes, and optional novel-view orbit/swing videos
  • Gradio demo included; hosted Hugging Face space available for testing before local install
  • Supports multiple sparse depth formats: .png, .npy, .npz, .h5, .hdf5, .exr
  • Training and evaluation code released as of April 2026; inference code arrived March 2026
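For intuition about the gap between relative and metric depth in the first two modes, the sketch below shows a common baseline: fitting a scale and shift that map relative predictions onto sparse sensor readings by least squares. This is a swapped-in textbook technique, not InfiniDepth's fusion mode (which, per the summary above, uses the sensor data inside a separate inference mode), and it assumes the relative prediction is affinely related to metric depth. The function name and sample values are hypothetical.

```python
def fit_scale_shift(relative, metric):
    """Least-squares (scale, shift) minimising sum((scale * r + shift - m) ** 2)."""
    n = len(relative)
    if n < 2 or n != len(metric):
        raise ValueError("need at least two paired samples")
    mean_r = sum(relative) / n
    mean_m = sum(metric) / n
    var_r = sum((r - mean_r) ** 2 for r in relative)
    if var_r == 0.0:
        raise ValueError("relative depths are constant; scale is undetermined")
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(relative, metric))
    scale = cov / var_r
    shift = mean_m - scale * mean_r
    return scale, shift


# Relative predictions at the pixels where the sensor returned a reading
# (made-up values; the metric samples follow m = 2.0 * r + 1.0).
relative_at_hits = [0.1, 0.4, 0.7, 0.9]
sensor_metres = [1.2, 1.8, 2.4, 2.8]

scale, shift = fit_scale_shift(relative_at_hits, sensor_metres)

# Apply the fitted transform to every pixel of the relative map.
dense_relative = [0.0, 0.25, 0.5, 1.0]
dense_metric = [scale * r + shift for r in dense_relative]
print(scale, shift, dense_metric)
```

A global affine fit like this cannot correct local errors in the relative prediction, which is one reason a dedicated sensor-fusion mode can be preferable when sparse metric depth is available.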

Caveats

  • The README is thorough on inference but says nothing about training data, compute requirements, or quantitative benchmarks against prior work
  • Multi-view mode depends on Depth Anything 3 (DA3-LARGE-1.1) for default sequence alignment, adding a heavy external dependency
  • Camera intrinsics (fx_org, fy_org, cx_org, cy_org) are “strongly recommended” for sensor fusion mode; the fallback behavior is unspecified
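The reason intrinsics matter for anything 3D can be seen in the standard pinhole back-projection below. This is a generic sketch, not code from the repository: `backproject` and its argument names are illustrative, and the intrinsics are assumed to be expressed at the same resolution as the depth map being lifted.

```python
def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (list of rows, in metres) to camera-space XYZ points.

    Uses the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with non-positive depth are treated as missing and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Tiny 2x2 depth map with one missing pixel (made-up values).
depth = [
    [2.0, 2.0],
    [0.0, 4.0],
]
points = backproject(depth, fx=100.0, fy=100.0, cx=0.5, cy=0.5)
print(points)
```

With wrong or guessed focal lengths, the X and Y coordinates stretch or shrink relative to Z, so point clouds and Gaussians built on top of the depth come out distorted even when the depth values themselves are accurate.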

Verdict

Worth a look if you need depth or 3D Gaussian exports at non-standard resolutions, or if you’re sitting on RGB+LiDAR data that needs densifying. Skip it if you want a lightweight drop-in replacement for MiDaS: this is a research codebase with multiple model checkpoints and a nontrivial setup.
