DrewNF/Tensorflow_Object_Tracking_Video

A 2017 master's thesis that glued YOLO, TensorBox, and Inception together

Video object tracking built from off-the-shelf parts for the ImageNet VID competition, with the rough edges left visible.

502 stars Python Computer Vision

What it does

This is a master’s thesis project for video object tracking in TensorFlow, built to compete in ImageNet’s VID challenge. It chains together existing open-source implementations: YOLO or TensorBox for detection, hand-rolled post-processing for temporal smoothing, and Inception for classification. You feed it a video file; it spits out an annotated MP4.
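
To make the chain concrete, here is a minimal sketch of a three-stage cascade in plain Python. It is not the repository's code: `Detection`, `run_cascade`, and the stand-in lambdas are hypothetical names invented for illustration, and the real stages would wrap YOLO or TensorBox, the post-processing step, and Inception.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


@dataclass
class Detection:
    """Hypothetical record for one box in one frame."""
    frame_index: int
    box: Box
    label: Optional[str] = None  # filled in by the classification stage


def run_cascade(
    frames: List[object],
    detect: Callable[[object], List[Box]],
    smooth: Callable[[List[Detection]], List[Detection]],
    classify: Callable[[object, Box], str],
) -> List[Detection]:
    """Chain per-frame detection, temporal post-processing, then classification."""
    detections = [
        Detection(index, box)
        for index, frame in enumerate(frames)
        for box in detect(frame)
    ]
    detections = smooth(detections)
    for det in detections:
        det.label = classify(frames[det.frame_index], det.box)
    return detections


# Stand-in stages so the sketch runs without TensorFlow or OpenCV.
demo = run_cascade(
    frames=["frame0", "frame1"],
    detect=lambda frame: [(10, 10, 50, 50)],
    smooth=lambda dets: dets,
    classify=lambda frame, box: "car",
)
print([(d.frame_index, d.label) for d in demo])
```

Passing the stages in as callables mirrors the off-the-shelf framing: any detector or classifier with a matching signature could be swapped in.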

The interesting bit

The architecture explicitly copies the T-CNN paper’s cascade strategy—detection first, then temporal information, then context—but replaces the trainable temporal components with non-trainable post-processing algorithms in Utils_Video.py. It’s a frankenstein that admits it’s a frankenstein.
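
For a sense of what non-trainable temporal post-processing can look like, the sketch below averages each box coordinate over a small window of neighbouring frames. This is an assumed stand-in technique, not what Utils_Video.py does; `smooth_track` and its `window` parameter are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def smooth_track(boxes: List[Box], window: int = 3) -> List[Box]:
    """Centered moving average over one object's boxes, one box per frame.

    Nothing here is learned: the only knob is the window size.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(boxes)):
        # Clamp the window at the start and end of the track.
        lo = max(0, i - half)
        hi = min(len(boxes), i + half + 1)
        neighbours = boxes[lo:hi]
        smoothed.append(
            tuple(sum(b[k] for b in neighbours) / len(neighbours) for k in range(4))
        )
    return smoothed


# The middle frame jumps 30 px to the right; averaging pulls it back.
track = [(10, 10, 50, 50), (40, 10, 80, 50), (10, 10, 50, 50)]
print(smooth_track(track))
```

A fixed filter like this is cheap and predictable, but it cannot adapt to fast motion the way a trained temporal model could, which is the trade-off the README's caveat points at.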

Key highlights

  • Supports YOLO (single-class detection) and TensorBox (multi-class) pipelines
  • Includes dataset preprocessing scripts with brute-force and lightweight modes
  • Provides trained weights for both Inception and TensorBox via Mega.nz links
  • Hardcoded to 640×480 PNG for TensorBox; you’ll need to hack the resize scripts for other dimensions
  • Requires OpenCV, TensorFlow, and Python; installation guide included
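
If frames are resized to 640×480 before detection, the boxes that come back are in 640×480 coordinates and need mapping onto the original resolution before drawing. A minimal sketch of that arithmetic follows; `rescale_box` and `TENSORBOX_SIZE` are hypothetical names, and whether the repository does this step itself is not stated in the README summary.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Size = Tuple[int, int]  # (width, height)

TENSORBOX_SIZE: Size = (640, 480)  # the hardcoded input size noted above


def rescale_box(box: Box, from_size: Size, to_size: Size) -> Box:
    """Map a box from one frame size to another, scaling each axis independently."""
    sx = to_size[0] / from_size[0]
    sy = to_size[1] / from_size[1]
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


# A detection on the 640x480 frame, mapped back to a 1280x720 source video.
box_640 = (160.0, 120.0, 320.0, 240.0)
box_hd = rescale_box(box_640, TENSORBOX_SIZE, (1280, 720))
print(box_hd)
```

Note that 1280×720 and 640×480 have different aspect ratios, so the two axes scale by different factors (2.0 and 1.5 here). A resize that ignores this distorts objects, which is one reason a hardcoded input size is worth knowing about up front.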

Caveats

  • Temporal information is “retrieved through some post processing algorithm… NOT TRAINABLE” — the README’s own caps
  • Weight download links date from 2017 and point to Mega.nz; they may no longer resolve
  • Early output had frame-ordering bugs causing flicker; author notes this was “then solved” but the fix isn’t detailed
  • “I will soon put a weight file to download” is still in the README as of its last update, dated 10-03-2017
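
The README does not say what caused the frame-ordering bug. One common cause in pipelines that dump frames to numbered image files is lexicographic sorting, where `frame10.png` lands before `frame2.png`. The sketch below shows that failure and a natural-sort fix, purely as an illustration of the class of bug; `natural_key` is a hypothetical helper and the file names are made up.

```python
import re
from typing import List, Union


def natural_key(name: str) -> List[Union[int, str]]:
    """Split a file name into text and integer chunks so numbers compare numerically."""
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"(\d+)", name)
    ]


frames = ["frame10.png", "frame2.png", "frame1.png"]

lexicographic = sorted(frames)
natural = sorted(frames, key=natural_key)

print(lexicographic)  # frame10.png sorts ahead of frame2.png
print(natural)        # frames in true numeric order
```

Zero-padding frame numbers when writing the files (`frame0002.png`) avoids the problem without a custom sort key.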

Verdict

Worth a look if you’re studying how to bolt together 2016-era detection models for video, or if you need a concrete (if dated) reference for the ImageNet VID pipeline. Skip it if you want a maintained, trainable end-to-end tracker—this is explicitly thesis-grade glue code with the scaffolding still showing.
