Is video-classification-3d-cnn-pytorch open source?

Yes — kenshohara/video-classification-3d-cnn-pytorch is open source, released under the MIT license.

What language is video-classification-3d-cnn-pytorch written in?

kenshohara/video-classification-3d-cnn-pytorch is primarily written in Python.

How popular is video-classification-3d-cnn-pytorch?

kenshohara/video-classification-3d-cnn-pytorch has 1.1k stars on GitHub.

Where can I find video-classification-3d-cnn-pytorch?

kenshohara/video-classification-3d-cnn-pytorch is on GitHub at https://github.com/kenshohara/video-classification-3d-cnn-pytorch.

kenshohara/video-classification-3d-cnn-pytorch

Pretrained 3D ResNet: drop in a video, get action labels

A straightforward inference wrapper for spatiotemporal CNNs trained on 400 human actions.

★ 1.1k stars · Python · Computer Vision · ML Frameworks

What it does

Feed it a folder of videos and a pretrained 3D ResNet (ResNet-34 or ResNeXt-101) and it writes JSON: either class scores across the 400 Kinetics action categories or 512-dim feature vectors, computed for every 16-frame clip. There’s also a small visualization script that overlays predictions back onto the source video.
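To make the "every 16 frames" behavior concrete, here is a minimal sketch of how a video's frames could be split into consecutive 16-frame clips. The function name `clip_ranges` is hypothetical, and the sketch assumes non-overlapping clips with any trailing remainder dropped; the repository may handle the tail differently.

```python
def clip_ranges(num_frames, clip_len=16):
    """Return (start, end) frame index pairs for consecutive, non-overlapping clips.

    Assumption (not confirmed by the README): a trailing remainder shorter
    than clip_len is dropped rather than padded.
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    return [
        (start, start + clip_len)
        for start in range(0, num_frames - clip_len + 1, clip_len)
    ]


# A 10-second video at 30 fps has 300 frames.
ranges = clip_ranges(300)
print(len(ranges))            # number of full 16-frame clips
print(ranges[0], ranges[-1])  # first and last clip boundaries
```

Under these assumptions, each returned pair would correspond to one set of class scores or one feature vector in the output.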

The interesting bit

The project is essentially a clean inference harness around the author’s earlier training codebase. The value isn’t novelty; it’s convenience. You don’t retrain: you download the weights, point the script at a directory of videos, and run it. The 2017 paper’s title asks “Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?”, and this repository answers with a pragmatic “yes, and here’s the tool.”

Key highlights

Pretrained on Kinetics-400 (400 action classes)
Two modes: score (class predictions) or feature (512-dim embeddings taken after global average pooling)
Supports ResNeXt-101, which the authors note performed best
Includes a result visualization script
Companion Lua/Torch version exists for the historically inclined
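Because score mode yields one set of class scores per clip, a common follow-up step is to collapse them into a single video-level prediction. The sketch below averages per-clip scores and returns the top-k labels. It is one possible aggregation, not something the repository is stated to do, and `video_level_topk`, the toy labels, and the score values are all illustrative assumptions.

```python
def video_level_topk(clip_scores, labels, k=3):
    """Average per-clip class scores and return the k best (label, mean_score) pairs."""
    if not clip_scores:
        return []
    n_classes = len(labels)
    totals = [0.0] * n_classes
    for scores in clip_scores:
        if len(scores) != n_classes:
            raise ValueError("each clip needs exactly one score per label")
        for i, score in enumerate(scores):
            totals[i] += score
    means = [total / len(clip_scores) for total in totals]
    # Sort labels by mean score, highest first.
    ranked = sorted(zip(labels, means), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy stand-ins for the 400 Kinetics classes (hypothetical values).
labels = ["archery", "bowling", "juggling"]
clip_scores = [
    [0.25, 0.5, 0.25],
    [0.0, 0.75, 0.25],
    [0.125, 0.25, 0.625],
]
print(video_level_topk(clip_scores, labels, k=2))
```

Plain averaging treats every clip equally; for long videos with several distinct actions, keeping the per-clip predictions may be more informative than a single label.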

Caveats

Setup instructions reference PyTorch 0.x-era conda channels (soumith, cuda80) and FFmpeg 3.3.3; expect to adapt for modern environments
The README is sparse on input and output specifics: supported resolutions, codec compatibility, and the exact JSON schema are left unstated
No mention of GPU memory requirements or batching behavior for long videos
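Since the JSON schema is undocumented, it may help to inspect the shape of an output file before writing code against it. The following is a small sketch using only the standard library; the helper `describe` is hypothetical, and the sample keys (`video`, `clips`, `segment`, `scores`) are made up for illustration rather than taken from the tool.

```python
import json


def describe(value):
    """Return a compact summary of a parsed JSON value's structure."""
    if isinstance(value, dict):
        inner = ", ".join(f"{key!r}: {describe(val)}" for key, val in value.items())
        return "{" + inner + "}"
    if isinstance(value, list):
        if not value:
            return "[]"
        # Summarize a list by its first element and its length.
        return f"[{describe(value[0])} x{len(value)}]"
    return type(value).__name__


# Hypothetical sample; the real output may use different keys and nesting.
sample = json.loads(
    '[{"video": "a.mp4", "clips": [{"segment": [1, 16], "scores": [0.1, 0.9]}]}]'
)
print(describe(sample))
```

To check a real run, replace `sample` with the result of `json.load` on the output file the tool produced.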

Verdict

Useful if you need quick, off-the-shelf action recognition or video feature extraction without building a pipeline from scratch. Skip it if you need fine-grained temporal modeling, custom classes, or production-grade robustness: this is research code with research-code edges.
