piergiaj/pytorch-i3d

DeepMind's video model, ported to PyTorch before PyTorch was cool

A 2017-era port of the I3D action-recognition architecture, frozen in PyTorch 0.3 amber with working weights and fine-tuning scripts.

1.1k stars · Python · Computer Vision · ML Frameworks
Velocity (7d): +0.4 ★/day · Trend: steady

What it does

This repo ports DeepMind’s I3D (Inflated 3D ConvNet) to PyTorch and provides weights for the RGB and optical-flow streams, both pre-trained on ImageNet and Kinetics. It includes scripts to fine-tune on the Charades dataset or to extract per-segment features and save them as NumPy arrays.
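To make "per-segment features" concrete, here is a minimal sketch of how a video's frame indices could be cut into fixed-length segments before each one is passed through the network. The function name `segment_bounds`, the segment length, and the stride are illustrative assumptions; the repo's own extraction script may chunk videos differently.

```python
from typing import List, Tuple


def segment_bounds(num_frames: int, seg_len: int = 16, stride: int = 16) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, for every full segment.

    seg_len and stride are hypothetical defaults chosen for illustration.
    A trailing partial segment is dropped rather than padded.
    """
    if seg_len <= 0 or stride <= 0:
        raise ValueError("seg_len and stride must be positive")
    return [(start, start + seg_len)
            for start in range(0, num_frames - seg_len + 1, stride)]


# A 40-frame clip, 16-frame segments, 8-frame stride (50% overlap).
bounds = segment_bounds(num_frames=40, seg_len=16, stride=8)
print(bounds)
```

Each `(start, end)` pair would select the frames for one forward pass, and the resulting feature vectors could then be stacked and saved as one array per video.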

The interesting bit

The author didn’t just reimplement the architecture: they converted DeepMind’s original weights, verified that the converted models produce “identical results,” and then fine-tuned on Charades with the same settings as the winning entry in the Charades 2017 challenge. That’s unusually meticulous for a 2017 PyTorch port.

Key highlights

  • Pre-trained models included: rgb_imagenet.pt, flow_imagenet.pt, plus Charades fine-tuned variants (rgb_charades.pt, flow_charades.pt)
  • extract_features.py loads full videos and dumps segment-level features to disk
  • charades_dataset.py handles the tedious frame-and-flow preprocessing pipeline
  • Based on the Carreira & Zisserman paper that helped establish two-stream 3D CNNs for video
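Because the weights come as separate RGB and flow models, a two-stream prediction needs the two outputs combined. The sketch below shows one simple option, averaging the per-class scores of both streams before taking the top class. The function names and the toy scores are assumptions for illustration, and the repo itself may not ship a fusion step.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_stream(rgb_logits: List[float], flow_logits: List[float]) -> List[float]:
    """Late fusion: average the class scores of the RGB and flow streams."""
    if len(rgb_logits) != len(flow_logits):
        raise ValueError("both streams must score the same number of classes")
    return [(r + f) / 2.0 for r, f in zip(rgb_logits, flow_logits)]


# Toy scores for three classes (made-up numbers, not model output).
rgb = [2.0, 0.5, -1.0]    # RGB stream alone prefers class 0
flow = [0.0, 2.5, -0.5]   # flow stream alone prefers class 1
fused = fuse_two_stream(rgb, flow)
probs = softmax(fused)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(fused, predicted)
```

In this toy case the streams disagree, and the averaged scores side with the flow stream because its margin is larger.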

Caveats

  • Explicitly written for PyTorch 0.3; 0.4+ “may cause issues” (understatement likely intended)
  • Requires optical flow and RGB frames pre-extracted as images on disk—no end-to-end video ingestion
  • No updates since ~2017; modern PyTorch and current CUDA versions are uncharted territory
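Given the version caveat, it can help to fail fast when the installed PyTorch is not the release the code was written for. This is a minimal sketch of such a guard that works on a plain version string. The helper names are hypothetical, and in real use you would pass in `str(torch.__version__)`.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings like '0.3.1' or '1.13.1+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_tested_version(version: str, tested: Tuple[int, int] = (0, 3)) -> bool:
    """True only when major.minor matches the release the code targets."""
    return parse_version(version) == tested


for candidate in ("0.3.1", "0.4.0", "1.13.1+cu117"):
    status = "ok" if is_tested_version(candidate) else "untested, expect breakage"
    print(f"PyTorch {candidate}: {status}")
```

Comparing only major and minor keeps patch releases such as 0.3.1 acceptable while flagging everything from 0.4 onward.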

Verdict

Grab this if you need legacy I3D weights in PyTorch for reproduction or feature extraction on older infrastructure. Skip it if you want a maintained, modern video model—look at PyTorchVideo or similar instead.
