asingh33/CNNGestureRecognizer

Hand gestures meet 2017-era Keras in this time-capsule CNN project

A complete webcam-to-gesture pipeline that shows its age in the dependencies but still demonstrates how to wire OpenCV preprocessing to a trainable CNN.

1k stars · Python · Computer Vision · ML Frameworks

What it does

CNNGestureRecognizer is a Python application that captures hand gestures via webcam, preprocesses the frames with OpenCV, and classifies them into one of five categories: OK, PEACE, STOP, PUNCH, or NOTHING. It ships with 4,015 training images, pretrained weights, and a small Tkinter UI for switching between prediction, retraining, and layer visualization modes.

The interesting bit

The author treats this as an educational scaffold rather than a product. You can peek inside the model’s “thinking” with built-in feature-map visualization, and the README walks through exactly how the OpenCV preprocessing works, describing two pipelines (binary threshold and skin-color masking) suited to different lighting conditions. There’s even a Chrome Dino game hookup tucked into the README’s conclusion, which is truncated.
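To make the difference between the two pipelines concrete, here is a minimal sketch of the idea in plain Python. It is not the repo's code: the project does this with OpenCV on webcam frames, whereas this version uses nested lists and the standard-library `colorsys` module so it runs anywhere. The function names and every cutoff value (`cutoff`, `h_max`, `s_min`, `s_max`, `v_min`) are assumptions chosen for illustration.

```python
import colorsys

def binary_threshold(gray, cutoff=128):
    """Binary-mode idea: map each 0-255 grayscale pixel to 0 or 255.

    `cutoff` is a hypothetical value, not one taken from the project.
    """
    return [[255 if px > cutoff else 0 for px in row] for row in gray]

def skin_mask(rgb, h_max=0.14, s_min=0.15, s_max=0.70, v_min=0.30):
    """Skin-mask idea: keep pixels whose HSV values fall inside a skin-like range.

    The range bounds are illustrative assumptions; colorsys reports
    hue, saturation, and value on a 0.0-1.0 scale.
    """
    mask = []
    for row in rgb:
        out = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            is_skin = h <= h_max and s_min <= s <= s_max and v >= v_min
            out.append(255 if is_skin else 0)
        mask.append(out)
    return mask

# Tiny 2x2 "frames" standing in for webcam input.
gray_frame = [[10, 200], [130, 128]]
rgb_frame = [[(224, 172, 105), (0, 0, 255)], [(255, 255, 255), (224, 172, 105)]]

binary = binary_threshold(gray_frame)
mask = skin_mask(rgb_frame)
```

The trade-off the listing describes falls out of the sketch: the binary version only needs brightness contrast against the background, while the skin-mask version depends on colour, so it needs lighting that keeps skin tones inside the chosen HSV range.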

Key highlights

  • Pretrained on 4,015 self-collected images (803 per class) for 15 epochs; weights split by OS due to backend serialization quirks
  • Two capture modes: Binary Mode for clean backgrounds, SkinMask Mode for HSV-based skin detection when lighting is good
  • Real-time prediction with an in-app probability bar chart; can dump results to JSON for external plotting
  • Layer visualization via Keras backend functions—see which filters activate on your own gesture images
  • Includes hooks to retrain or extend the model with custom gestures
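As a rough illustration of the prediction-to-JSON highlight above, the sketch below turns five raw class scores into probabilities and serializes them. It is an assumption-laden stand-in, not the project's code: the class order in `GESTURES`, the `prediction_record` helper, the JSON field names, and the sample scores are all hypothetical.

```python
import json
import math

# The five categories named in the listing; this ordering is an assumption.
GESTURES = ["OK", "PEACE", "STOP", "PUNCH", "NOTHING"]

def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def prediction_record(scores):
    """Build a JSON-friendly record: top gesture plus per-class probabilities."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "gesture": GESTURES[best],
        "probabilities": dict(zip(GESTURES, probs)),
    }

# Made-up scores for one frame; a real run would take them from the model.
record = prediction_record([0.2, 3.1, 0.5, -1.0, 0.0])
payload = json.dumps(record, indent=2)
```

A record like `payload` can be written to a file and plotted by any external tool, which is all the in-app bar chart needs as input as well.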

Caveats

  • Dependency stack is frozen in 2017: Python 3.6.1, Keras 2.0.2, TensorFlow 1.2.1, and Theano 0.9.0 (explicitly noted as obsolete)
  • Pretrained weights are 150 MB each and hosted on Google Drive, not in the repo
  • The CNN architecture is a standard MNIST-style stack; the author admits it’s “pretty common” and not novel
  • OS-specific weight files suggest serialization fragility across platforms
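The 150 MB weight files are less surprising once you count parameters in an MNIST-style stack applied to larger images. The sketch below does that arithmetic under assumed layer sizes: a 200×200 grayscale input, two 3×3 convolutions with 32 filters each (no padding), one 2×2 max-pool, a 128-unit dense layer, and a 5-way output. None of these sizes are given in this listing, so treat them as hypothetical.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus biases for a square-kernel convolution layer."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus biases for a fully connected layer."""
    return inputs * units + units

# Assumed architecture (hypothetical, for illustration only).
side = 200                      # input is side x side, 1 channel
conv1 = conv_params(3, 1, 32)
side -= 2                       # 3x3 conv without padding trims 2 pixels
conv2 = conv_params(3, 32, 32)
side -= 2
side //= 2                      # 2x2 max-pool halves each dimension
flattened = side * side * 32    # values fed into the dense layer
dense = dense_params(flattened, 128)
output = dense_params(128, 5)   # five gesture classes

total_params = conv1 + conv2 + dense + output
size_mib = total_params * 4 / (1024 ** 2)  # 4 bytes per float32 weight
```

Under these assumptions almost all of the parameters sit in the first dense layer, and the float32 total lands at roughly 150 MiB, which is in line with the file size quoted above. Shrinking the input or pooling more aggressively before flattening would cut that figure sharply.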

Verdict

Grab this if you’re teaching computer vision fundamentals or need a hackable baseline for gesture control. Skip it if you want modern MediaPipe-level accuracy without the dependency archaeology.
