isarandi/metrabs

3D pose from a single photo, no mocap suit required

MeTRAbs turns an RGB image into metric-scale 3D human poses, handling partial bodies and lens distortion without breaking stride.

What it does

MeTRAbs estimates absolute 3D human poses from ordinary RGB images—no depth sensor, no calibrated multi-camera rig. Feed it a photo or video frame and it returns 2D keypoints, 3D poses in camera space, and optionally 3D world coordinates if you provide camera calibration. The models run as standalone TensorFlow SavedModels, so one tfhub.load() call gets you inference without dragging in the entire training codebase.
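To make the camera-space versus world-space distinction concrete, here is a minimal NumPy sketch of the conversion that camera calibration enables. It is not MeTRAbs code: the function name `camera_to_world` and the extrinsics convention (`x_cam = R @ x_world + t`) are assumptions chosen for illustration, and the project's own camera handling may use a different convention.

```python
import numpy as np


def camera_to_world(poses_cam, R, t):
    """Map camera-space joints of shape (..., J, 3) to world space.

    Assumed (hypothetical) convention: x_cam = R @ x_world + t,
    so x_world = R.T @ (x_cam - t).
    """
    poses_cam = np.asarray(poses_cam, dtype=np.float64)
    # Row-vector form of R.T @ (x_cam - t): (x_cam - t) @ R
    return (poses_cam - t) @ R


# Illustrative extrinsics: camera rotated 90 degrees about the z axis
# and offset along z. These numbers are made up for the example.
theta = np.pi / 2
R = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
t = np.array([0.0, 0.0, 1000.0])

# Two people, three joints each, in world coordinates.
world = np.arange(18, dtype=np.float64).reshape(2, 3, 3)
# Project into camera space with the assumed convention, then invert.
cam = world @ R.T + t
recovered = camera_to_world(cam, R, t)
```

Without `R` and `t`, only the camera-space output is meaningful, which is why the world-coordinate result is optional.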

The interesting bit

The “truncation-robust” part matters: the model doesn’t fall apart when limbs are cropped or partially out of frame, a common failure mode in pose estimators. It also undoes radial/tangential lens distortion on the GPU and applies gamma-correct rescaling—details that usually get hand-waved but directly affect accuracy on real-world footage.
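As a rough illustration of why gamma-correct rescaling matters, the sketch below downscales an image by averaging in linear light instead of on the encoded pixel values. It is a CPU NumPy sketch, not the project's GPU implementation; the function name `downscale2x_gamma_correct` is hypothetical, and the pure power-law gamma of 2.2 is a simplifying assumption standing in for the exact sRGB transfer curve.

```python
import numpy as np

GAMMA = 2.2  # assumption: power-law approximation of the sRGB curve


def downscale2x_gamma_correct(img_u8):
    """Halve an 8-bit image (H, W) or (H, W, C) with a 2x2 box filter,
    averaging in linear light rather than on gamma-encoded values."""
    img = np.asarray(img_u8, dtype=np.float64) / 255.0
    linear = img ** GAMMA  # decode to (approximately) linear light

    # Crop to even dimensions so the image splits into 2x2 blocks.
    h2 = linear.shape[0] // 2 * 2
    w2 = linear.shape[1] // 2 * 2
    linear = linear[:h2, :w2]

    # Group pixels into 2x2 blocks and average each block.
    blocks = linear.reshape(h2 // 2, 2, w2 // 2, 2, *linear.shape[2:])
    pooled = blocks.mean(axis=(1, 3))

    encoded = pooled ** (1.0 / GAMMA)  # re-encode for display
    return np.clip(np.rint(encoded * 255.0), 0, 255).astype(np.uint8)


# A black/white checkerboard: every 2x2 block is half black, half white.
checker = (np.indices((4, 4)).sum(axis=0) % 2 * 255).astype(np.uint8)
small = downscale2x_gamma_correct(checker)

# Naive averaging of the encoded values, for comparison.
naive = checker.reshape(2, 2, 2, 2).mean(axis=(1, 3))
```

On this checkerboard, naive averaging gives mid-gray pixel values of 127.5, while the linear-light version comes out noticeably brighter (about 186), which is closer to how the pattern looks when viewed from a distance.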

Key highlights

  • Single-call model loading via TensorFlow Hub; experimental PyTorch support added in 2023
  • Multiple skeleton formats (COCO, SMPL, H36M) selectable at runtime
  • Built-in test-time augmentation and plausibility filtering to suppress weird poses
  • Backbone options from EfficientNetV2 (accurate) to MobileNetV3 (fast)
  • Won the 3DPW Challenge; code and models upgraded to TensorFlow 2 with ongoing maintenance

Caveats

  • Models are non-commercial only due to training dataset licenses
  • The PyTorch port is labeled “experimental” in the README
  • Multi-dataset training details and full evaluation scripts require digging into the docs/ directory

Verdict

Worth a look if you need 3D pose from monocular video without building a capture studio. Skip it if your use case is commercial, or if you need real-time performance guarantees: speed depends heavily on which backbone you choose.
