Computer Vision

heavyweights · velocity + momentum
01
roboflow/supervision
+530 ★/day · accelerating

A model-agnostic Python toolkit that handles the boring parts of computer vision: annotations, dataset juggling, and tracking.

43.6k Python Computer Vision · explained
02
ruvnet/RuView
+363 ★/day · accelerating

A $9 ESP32 board turns radio reflections into room-scale presence detection, vital signs, and pose estimation — no lenses, no wearables, no cloud.

72.9k Rust Domain Apps · explained
03
PaddlePaddle/PaddleOCR
+332 ★/day · accelerating

PaddleOCR turns scans and PDFs into structured Markdown or JSON using a tiny vision-language model that punches above its weight class.

81.8k Python Computer Vision · explained
04
opencv/opencv
+159 ★/day · accelerating

OpenCV is the de facto standard for computer vision, and its README is almost aggressively humble about it.

88.9k C++ Computer Vision · explained
05
NVlabs/Eagle
+61 ★/day · accelerating

Eagle is less a single model than NVIDIA's internal R&D pipeline for multimodal AI, now open-sourced with three generations of VLMs and a grounding specialist.

2.4k Python Language Models · explained
06
wiltodelta/remove-ai-watermarks
+39 ★/day · cooling

A Python toolkit that reverse-engineers alpha-blended logos, strips C2PA manifests, and diffuses away invisible fingerprints like SynthID.

3.2k Python Computer Vision · explained
07
ultralytics/ultralytics
+45 ★/day · accelerating

Ultralytics turned the classic object detector into a unified computer-vision Swiss Army knife you can train via CLI or Python.

58.3k Python Computer Vision · explained
08
hiroi-sora/Umi-OCR
+40 ★/day · accelerating

A Qt-based desktop app for screenshot, batch, and PDF OCR without phoning home to any API.

45.1k Python Computer Vision · explained
10
facebookresearch/sam-3d-body
+32 ★/day · accelerating

A foundation model that turns one image into a full 3D body mesh, optionally guided by keypoints or masks like the original SAM.

3.2k Python Computer Vision · explained
12
upscayl/upscayl
+35 ★/day · accelerating

Upscayl wraps Real-ESRGAN and Vulkan in an Electron app so you can enlarge images without paying Topaz Gigapixel's rent.

46k TypeScript Computer Vision · explained
13
datalab-to/surya
+33 ★/day · accelerating

Surya does OCR, layout analysis, reading order, and table recognition in 90+ languages from a single VLM.

20.8k Python Computer Vision · explained
14
Robbyant/lingbot-map
+23 ★/day · cooling

LingBot-Map reconstructs scenes from streaming video in one forward pass, handling 10,000+ frames without iterative optimization.

7.2k Python Computer Vision · explained
15
blakeblackshear/frigate
+25 ★/day · accelerating

Frigate is a local NVR that runs AI object detection on IP cameras without phoning home to the cloud.

33.7k TypeScript Computer Vision · explained
17
screenpipe/screenpipe
+22 ★/day · cooling

screenpipe records everything you see, say, and hear—locally, searchable, and feedable to AI agents.

19.2k Rust Agents · explained
18
tesseract-ocr/tesseract
+20 ★/day · accelerating

HP's abandoned text-recognition project became the open-source default for turning images into words.

74.6k C++ Computer Vision · explained
19
facebookresearch/sam3
+18 ★/day · cooling

A foundation model that segments images and videos using open-vocabulary text prompts like "a player in white."

10.5k Python Computer Vision · explained