guanfuchen/video_obj

A Chinese-language field guide to video object detection

A curated survey repo that explains why single-frame detection fails on video—and how temporal context fixes it.

505 stars · Python · Computer Vision · ML Frameworks
Velocity (7d): +0.2 ★ / day · Trend: steady

What it does

This repo is a living literature review of video object detection, written in Chinese. It collects papers, datasets, and implementation notes, organized around a core insight: video gives you temporal context that still-image detectors ignore.

The interesting bit

The author doesn’t just list papers—they explain the actual engineering trade-offs. Two camps emerge: one uses motion information (optical flow, tubelet rescoring) to speed up detection by reusing features across frames; the other fuses temporal context to improve accuracy when frames are blurry, occluded, or poorly scaled. The MSRA work on flow-guided feature warping gets singled out as “cleaner” and the only end-to-end trainable approach at the time.
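To make the feature-reuse idea concrete, here is a minimal, dependency-free sketch of flow-guided feature warping. It is not taken from the repo or from the MSRA code: the function name `warp_features`, the single-channel nested-list layout, and the convention that flow points from each current-frame cell to its key-frame match are all assumptions made for illustration. The expensive backbone runs only on a key frame, and other frames borrow its features by bilinear sampling along the flow field.

```python
import math


def warp_features(key_feat, flow):
    """Warp a key-frame feature map to the current frame by bilinear sampling.

    key_feat: H x W nested list (a single channel, for brevity).
    flow:     H x W nested list of (dx, dy) displacements pointing from each
              current-frame cell to its matching key-frame position
              (an assumed convention for this sketch).
    """
    h, w = len(key_feat), len(key_feat[0])
    warped = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Sampling position in the key frame, clamped to the map borders.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            wx, wy = sx - x0, sy - y0
            # Blend the four neighbouring key-frame values.
            top = key_feat[y0][x0] * (1 - wx) + key_feat[y0][x1] * wx
            bottom = key_feat[y1][x0] * (1 - wx) + key_feat[y1][x1] * wx
            warped[y][x] = top * (1 - wy) + bottom * wy
    return warped


key = [[0.0, 1.0, 2.0, 3.0],
       [4.0, 5.0, 6.0, 7.0]]
shift_right = [[(1.0, 0.0)] * 4 for _ in range(2)]
warped = warp_features(key, shift_right)
```

Because bilinear sampling is differentiable with respect to both the features and the flow, a flow network and a detector can be trained jointly, which is what makes an end-to-end pipeline of this kind possible.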

Key highlights

  • Surveys CUHK and MSRA research lines with enough detail to grasp the methodological differences
  • Catalogs video detection datasets: ImageNet VID, YouTube-Objects, YouTube-BoundingBoxes
  • Includes practical tooling: mAP evaluation references, Faster R-CNN and Cascade R-CNN links
  • Explicitly flags Seq-NMS as a small, self-contained module worth trying first
  • Links to a Zhihu discussion that frames the field better than most English intros
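For readers wondering why Seq-NMS counts as small and self-contained, the sketch below shows its core step in simplified form. It is an illustration under assumptions, not the reference implementation: the helper names (`iou`, `best_sequence`, `rescore`), the 0.5 linking threshold, and average rescoring are choices made here, and the full method also suppresses overlapping boxes and repeats until no detections remain. Boxes in adjacent frames are linked when they overlap enough, dynamic programming finds the highest-scoring chain, and every box on that chain receives the chain's average score.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def best_sequence(frames, link_iou=0.5):
    """Return the highest-scoring chain of linked boxes as (frame, index) pairs.

    frames: list of per-frame detection lists; each detection is (box, score)
            with a positive score.
    """
    # best[t][i] = (total score of the best chain ending at box i, predecessor)
    best = [[(score, None) for _, score in dets] for dets in frames]
    for t in range(1, len(frames)):
        for i, (box, score) in enumerate(frames[t]):
            for j, (prev_box, _) in enumerate(frames[t - 1]):
                cand = best[t - 1][j][0] + score
                if iou(prev_box, box) >= link_iou and cand > best[t][i][0]:
                    best[t][i] = (cand, j)
    # Start from the chain end with the largest accumulated score.
    ends = ((t, i) for t in range(len(frames)) for i in range(len(frames[t])))
    end = max(ends, key=lambda ti: best[ti[0]][ti[1]][0], default=None)
    path = []
    while end is not None:
        t, i = end
        path.append((t, i))
        prev = best[t][i][1]
        end = (t - 1, prev) if prev is not None else None
    return path[::-1]


def rescore(frames, path):
    """Give every detection on the path the path's average score."""
    if not path:
        return frames
    avg = sum(frames[t][i][1] for t, i in path) / len(path)
    on_path = set(path)
    return [[(box, avg if (t, i) in on_path else s)
             for i, (box, s) in enumerate(dets)]
            for t, dets in enumerate(frames)]


# Hypothetical detections: one object tracked over three frames, weak in the
# middle frame (for example, motion blur), plus an unrelated box.
frames = [
    [((0, 0, 10, 10), 0.9)],
    [((1, 0, 11, 10), 0.3), ((50, 50, 60, 60), 0.6)],
    [((2, 0, 12, 10), 0.8)],
]
path = best_sequence(frames)
rescored = rescore(frames, path)
```

In this toy input the blurred middle-frame detection is lifted from 0.3 to the chain average, while the unlinked box keeps its original score. Since the step only post-processes per-frame detections, it can sit behind any still-image detector without retraining.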

Caveats

  • No original code implementations here—this is a reading list and note collection, not a framework
  • README is mostly prose and paper titles; some sections trail off with “TODO” or broken image links
  • Several dataset description paragraphs are truncated mid-sentence

Verdict

Worth bookmarking if you’re entering video detection and read Chinese. Skip it if you need runnable code or prefer English-only resources; treat it as a curated syllabus, not a codebase.
