DeepMind's video model, ported to PyTorch before PyTorch was cool
A 2017-era port of the I3D action-recognition architecture, frozen in PyTorch 0.3 amber with working weights and fine-tuning scripts.

What it does
This repo ports DeepMind’s I3D (Inflated 3D ConvNet) to PyTorch, providing RGB and optical-flow weights pre-trained on ImageNet and Kinetics. It includes scripts to fine-tune on the Charades dataset or to extract per-segment features and save them as NumPy arrays.
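For context, the "inflated" in I3D refers to how the original paper bootstraps 3D filters from 2D image-classification ones: each 2D kernel is repeated along a new time axis and rescaled by the temporal depth. A minimal NumPy sketch of that idea follows; the function name and the shapes are illustrative assumptions, not code from this repo.

```python
import numpy as np

def inflate_2d_kernel(kernel_2d: np.ndarray, time_depth: int) -> np.ndarray:
    """Inflate (out_ch, in_ch, H, W) weights to (out_ch, in_ch, T, H, W).

    The 2D kernel is repeated T times along a new time axis and divided
    by T, so a clip made of one repeated frame produces the same response
    as the 2D filter applied to that single frame.
    """
    if time_depth < 1:
        raise ValueError("time_depth must be >= 1")
    repeated = np.repeat(kernel_2d[:, :, np.newaxis, :, :], time_depth, axis=2)
    return repeated / time_depth

# Illustrative shapes only: 64 filters, 3 input channels, 7x7 spatial, 7 frames deep.
rng = np.random.default_rng(0)
w2d = rng.standard_normal((64, 3, 7, 7))
w3d = inflate_2d_kernel(w2d, time_depth=7)
print(w3d.shape)  # (64, 3, 7, 7, 7)
```

Summing the inflated kernel over its time axis recovers the original 2D kernel, which is the property that lets image-pretrained weights serve as a sensible starting point for video.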
The interesting bit
The author didn’t just reimplement the architecture: DeepMind’s original weights were converted and verified to produce “identical results,” then fine-tuned using the exact settings from the Charades 2017 challenge-winning entry. That’s unusually meticulous for a 2017 PyTorch port.
Key highlights
- Pre-trained models included: rgb_imagenet.pt, flow_imagenet.pt, plus Charades fine-tuned variants (rgb_charades.pt, flow_charades.pt)
- extract_features.py loads full videos and dumps segment-level features to disk
- charades_dataset.py handles the tedious frame-and-flow preprocessing pipeline
- Based on the Carreira & Zisserman paper that helped establish two-stream 3D CNNs for video
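Segment-level extraction boils down to slicing a long frame sequence into fixed-length clips and running the network on each one. The sketch below shows only that slicing and stacking step; `segment_len`, `stride`, and the stand-in `fake_model` are assumptions for illustration, not the parameters or logic of extract_features.py.

```python
import numpy as np

def split_into_segments(frames, segment_len, stride):
    """Yield consecutive clips of `segment_len` frames; a short tail is dropped."""
    if segment_len < 1 or stride < 1:
        raise ValueError("segment_len and stride must be >= 1")
    for start in range(0, len(frames) - segment_len + 1, stride):
        yield frames[start:start + segment_len]

def extract_segment_features(frames, model, segment_len=16, stride=16):
    """Run `model` on every clip and stack the results into one array."""
    feats = [model(clip) for clip in split_into_segments(frames, segment_len, stride)]
    if not feats:
        raise ValueError("video is shorter than one segment")
    return np.stack(feats)

def fake_model(clip):
    # Hypothetical stand-in for a real network: averages over time and
    # space, leaving one value per colour channel.
    return clip.mean(axis=(0, 1, 2))

# Tiny dummy video: 100 frames of 8x8 RGB, laid out as (T, H, W, C).
video = np.zeros((100, 8, 8, 3), dtype=np.float32)
features = extract_segment_features(video, fake_model)
print(features.shape)  # (6, 3)
```

With 100 frames and non-overlapping 16-frame clips, six full segments fit and the last four frames are discarded. The resulting array can be written to disk with `np.save`.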
Caveats
- Explicitly written for PyTorch 0.3; 0.4+ “may cause issues” (understatement likely intended)
- Requires optical flow and RGB frames pre-extracted as images on disk—no end-to-end video ingestion
- No updates since ~2017; modern PyTorch and current CUDA versions are uncharted territory
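Given the version caveat, it may be worth failing fast before loading any weights. Here is a small sketch of such a guard, assuming the port targets the 0.3.x series as stated above; the helper names are hypothetical and it only parses a version string, so it runs without PyTorch installed.

```python
import re

def torch_major_minor(version: str):
    """Parse strings like '0.3.1' or '1.13.1+cu117' into (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))

def is_target_version(version: str) -> bool:
    """True only for the 0.3.x series the port was written against."""
    return torch_major_minor(version) == (0, 3)

for v in ("0.3.1", "0.4.0", "1.13.1+cu117"):
    status = "matches the port's target" if is_target_version(v) else "untested, expect breakage"
    print(f"{v}: {status}")
```

In a real script, the string passed in would be `str(torch.__version__)`.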
Verdict
Grab this if you need legacy I3D weights in PyTorch for reproduction or feature extraction on older infrastructure. Skip it if you want a maintained, modern video model—look at PyTorchVideo or similar instead.