harvitronix/five-video-classification-methods

Five ways to teach a neural network what happens in a video

A 2017-era reference implementation comparing ConvNet, LSTM, 3D CNN, and MLP approaches to video classification, all wired up for the UCF101 dataset.

★ 1.2k stars · Python · Computer Vision · ML Frameworks

What it does

This repo implements five classic architectures for classifying human actions in video, using the UCF101 dataset of 101 action categories. You get frame-by-frame CNN classification, CNN-to-LSTM pipelines (both two-stage and end-to-end LRCN), CNN-to-MLP, and 3D convolutional networks. It’s essentially a working notebook made public: run the data extraction scripts, extract CNN features for the models that consume them (about eight hours on a mid-tier GPU), then train your pick of models.
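To make the simplest of the five ideas concrete, here is a minimal sketch of how per-frame CNN outputs could be rolled up into one label for a whole clip. Averaging is an assumption chosen for illustration, not necessarily what the repo does, and the function names, class names, and probabilities below are all made up.

```python
from typing import Dict, Sequence


def average_frame_predictions(
    frame_probs: Sequence[Sequence[float]],
    class_names: Sequence[str],
) -> Dict[str, float]:
    """Average per-frame class probabilities into one score per class."""
    if not frame_probs:
        raise ValueError("need at least one frame prediction")
    n_classes = len(class_names)
    totals = [0.0] * n_classes
    for probs in frame_probs:
        if len(probs) != n_classes:
            raise ValueError("each frame needs one probability per class")
        for i, p in enumerate(probs):
            totals[i] += p
    n_frames = len(frame_probs)
    return {name: total / n_frames for name, total in zip(class_names, totals)}


def top_class(video_scores: Dict[str, float]) -> str:
    """Return the class with the highest averaged score."""
    return max(video_scores, key=video_scores.get)


# Hypothetical softmax outputs for a 3-frame clip over 3 illustrative classes.
classes = ["Archery", "Biking", "Diving"]
frames = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]
scores = average_frame_predictions(frames, classes)
print(top_class(scores))
```

In this toy input, two of the three frames individually favour "Archery", yet the averaged scores favour "Biking". That gap between per-frame votes and clip-level evidence is one reason the sequence models exist at all.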

The interesting bit

The value is in the side-by-side comparison, not novelty. The author wired up five approaches that were standard circa 2017 so you can see how they differ in complexity and plumbing — from “just pretend video is a stack of images” to “actually model spatiotemporal cubes with 3D convolutions.” The LRCN variant (time-distributed CNN feeding an RNN in one network) is the most architecturally elegant of the bunch.
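The difference between the "stack of images" view and the "spatiotemporal cube" view shows up directly in tensor shapes. The sketch below is plain shape arithmetic for an unpadded convolution, not code from the repo; the clip size, kernel size, and filter count are assumed values picked for illustration.

```python
from typing import Tuple

Shape = Tuple[int, int, int, int]  # (frames, height, width, channels)


def conv_out(size: int, kernel: int, stride: int = 1) -> int:
    """Output length of an unpadded ('valid') convolution along one axis."""
    return (size - kernel) // stride + 1


def conv2d_per_frame_shape(clip: Shape, kernel: int, filters: int) -> Shape:
    """A 2D convolution applied to each frame independently.

    The time axis is untouched: every frame is processed on its own,
    which is what a time-distributed CNN does before an RNN sees the sequence.
    """
    frames, height, width, _ = clip
    return (frames, conv_out(height, kernel), conv_out(width, kernel), filters)


def conv3d_shape(clip: Shape, kernel: int, filters: int) -> Shape:
    """A 3D convolution whose kernel also slides along the time axis."""
    frames, height, width, _ = clip
    return (
        conv_out(frames, kernel),
        conv_out(height, kernel),
        conv_out(width, kernel),
        filters,
    )


clip = (40, 80, 80, 3)  # assumed: 40 RGB frames of 80x80 pixels
print(conv2d_per_frame_shape(clip, kernel=3, filters=32))  # (40, 78, 78, 32)
print(conv3d_shape(clip, kernel=3, filters=32))            # (38, 78, 78, 32)
```

The per-frame version keeps all 40 time steps and leaves temporal modelling to whatever comes next (an LSTM or MLP), while the 3D version already mixes neighbouring frames, so the time axis shrinks along with height and width.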

Key highlights

Five model architectures defined in a single models.py for easy comparison
Full data pipeline from raw UCF101 videos to frame sequences and CSV manifests
Feature extraction cached to disk so LSTM/MLP training doesn’t re-run CNN forward passes
TensorBoard and CSV logging built in
Multiple worker support in the data generator (checked off the TODO list)
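
The feature-caching highlight is easy to picture as a load-or-compute helper. This is a rough sketch of the general pattern only: the cache layout, file format, and every name here (`cached_features`, `fake_extractor`, the `sequences` directory, the video ID) are assumptions for illustration, and the repo's actual storage scheme may differ.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable, List

Features = List[List[float]]  # one feature vector per frame


def cached_features(
    video_id: str,
    cache_dir: Path,
    extract_fn: Callable[[str], Features],
) -> Features:
    """Load features from disk if present, otherwise compute and store them."""
    cache_path = cache_dir / f"{video_id}.pkl"
    if cache_path.exists():
        with cache_path.open("rb") as f:
            return pickle.load(f)
    features = extract_fn(video_id)  # the expensive CNN forward passes
    cache_dir.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as f:
        pickle.dump(features, f)
    return features


calls: List[str] = []


def fake_extractor(video_id: str) -> Features:
    """Stand-in for a real CNN; records each call so reuse is visible."""
    calls.append(video_id)
    return [[0.0, 1.0], [0.5, 0.5]]


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "sequences"
    first = cached_features("v_Archery_g01_c01", cache, fake_extractor)
    second = cached_features("v_Archery_g01_c01", cache, fake_extractor)

print(len(calls))  # the extractor ran only for the first request
```

The payoff is that every training run after the first reads small feature files instead of pushing every frame through the CNN again.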

Caveats

No demo script: you cannot point a finished model at a new video and get a prediction without writing code yourself
Locked to Keras 2 and TensorFlow 1.x — this is legacy stack territory now
Requires ffmpeg and manual path tweaking on non-Unix systems
Data augmentation and optical flow are on the TODO list, not implemented
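
On the ffmpeg and path-handling caveat, one common way to sidestep hard-coded Unix separators is to build paths with `pathlib` and pass ffmpeg an argument list. The helper below is a hypothetical sketch, not the repo's extraction script; the directory layout, file name, and `%04d` naming pattern are assumptions. It only builds the command, so it runs without ffmpeg installed.

```python
from pathlib import Path
from typing import List


def ffmpeg_extract_command(video: Path, out_dir: Path) -> List[str]:
    """Build an ffmpeg argument list that writes the video's frames as JPEGs.

    The %04d in the output pattern is ffmpeg's image-sequence numbering,
    producing files such as name-0001.jpg, name-0002.jpg, and so on.
    """
    pattern = out_dir / f"{video.stem}-%04d.jpg"
    return ["ffmpeg", "-i", str(video), str(pattern)]


# Hypothetical layout; pathlib picks the right separator for the host OS.
class_dir = Path("data") / "train" / "Archery"
cmd = ffmpeg_extract_command(class_dir / "v_Archery_g01_c01.avi", class_dir)
print(cmd)

# To actually extract frames (requires ffmpeg on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing a list rather than a shell string also avoids quoting problems when paths contain spaces.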

Verdict

Worth a look if you’re teaching computer vision or need a baseline to beat on UCF101. Skip it if you want production-ready video understanding — modern transformers and pre-trained video backbones have left this approach behind, and the dependency stack shows its age.

Frequently asked

What is harvitronix/five-video-classification-methods? A 2017-era reference implementation comparing ConvNet, LSTM, 3D CNN, and MLP approaches to video classification, all wired up for the UCF101 dataset.
Is five-video-classification-methods open source? Yes. harvitronix/five-video-classification-methods is open source, released under the MIT license.
What language is five-video-classification-methods written in? harvitronix/five-video-classification-methods is primarily written in Python.
How popular is five-video-classification-methods? harvitronix/five-video-classification-methods has 1.2k stars on GitHub.
Where can I find five-video-classification-methods? harvitronix/five-video-classification-methods is on GitHub at https://github.com/harvitronix/five-video-classification-methods.