Computer Vision

newcomers · velocity + momentum
01
ruvnet/RuView
+196 ★/day · steady

A $9 ESP32 board turns radio reflections into room-scale presence detection, vital signs, and pose estimation — no lenses, no wearables, no cloud.

71.7k Rust Domain Apps · explained
02
Robbyant/lingbot-map
+133 ★/day · steady

LingBot-Map reconstructs scenes from streaming video in one forward pass, handling 10,000+ frames without iterative optimization.

7.1k Python Computer Vision · explained
03
deepseek-ai/DeepSeek-OCR
+99 ★/day · steady

An LLM-centric vision encoder that squeezes documents into surprisingly few tokens, then lets the language model do the actual reading.

23.3k Python Inference · Serving · explained
04
zai-org/GLM-OCR
+55 ★/day · steady

GLM-OCR squeezes document understanding into a sub-1B model with a layout-aware pipeline and enough deployment options to please any ops team.

6.9k Python Computer Vision · explained
05
datalab-to/chandra
+46 ★/day · steady

Chandra OCR 2 turns scanned chaos into structured Markdown, HTML, or JSON without destroying the layout.

11.1k Python Computer Vision · explained
06

A Python toolkit that reverse-engineers alpha-blended logos, strips C2PA manifests, and diffuses away invisible fingerprints like SynthID.

3k Python Computer Vision · explained
08
ultralytics/ultralytics
+43 ★/day · steady

Ultralytics turned the classic object detector into a unified computer-vision Swiss Army knife you can train via CLI or Python.

58.1k Python Computer Vision · explained
09
microsoft/OmniParser
+40 ★/day · steady

OmniParser extracts clickable elements from raw screenshots so vision models can actually *do* things on a desktop without peeking at the DOM.

24.9k Jupyter Notebook Agents · explained
10
facebookresearch/dinov3
+35 ★/day · steady

DINOv3 is a family of self-supervised vision backbones designed to produce high-quality dense features for everything from semantic segmentation to satellite canopy mapping, often beating task-specialized models out of the box.

10.6k Jupyter Notebook Computer Vision · explained
11
facebookresearch/sam3
+32 ★/day · steady

A foundation model that segments images and videos using open-vocabulary text prompts like "a player in white."

10.4k Python Computer Vision · explained
12
PaddlePaddle/PaddleOCR
+37 ★/day · steady

PaddleOCR turns scans and PDFs into structured Markdown or JSON using a tiny vision-language model that punches above its weight class.

81.3k Python Computer Vision · explained
13
upscayl/upscayl
+33 ★/day · steady

Upscayl wraps Real-ESRGAN and Vulkan in an Electron app so you can enlarge images without paying Topaz Gigapixel's rent.

45.9k TypeScript Computer Vision · explained
14
roboflow/supervision
+32 ★/day · steady

A model-agnostic Python toolkit that handles the boring parts of computer vision: annotations, dataset juggling, and tracking.

41.5k Python Computer Vision · explained
15
rednote-hilab/dots.ocr
+28 ★/day · steady

A single small vision-language model that parses documents, charts, and even street signs into structured text or SVG code.

8.9k Python Computer Vision · explained
16
facebookresearch/vggt
+28 ★/day · steady

VGGT turns one image—or a hundred—into camera poses, depth maps, point clouds, and trackable 3D points without any optimization loop.

13.3k Python Computer Vision · explained
17
facebookresearch/sam2
+28 ★/day · steady

SAM 2 extends the original Segment Anything to video with streaming memory, turning one-off image masks into persistent object tracking.

19.3k Jupyter Notebook Computer Vision · explained
18
hiroi-sora/Umi-OCR
+29 ★/day · steady

A Qt-based desktop app for screenshot, batch, and PDF OCR without phoning home to any API.

45k Python Computer Vision · explained
19
facefusion/facefusion
+28 ★/day · steady

A Python toolkit for face manipulation built around job queues, remixable steps, and headless automation rather than one-off GUI wizardry.

28.7k Python Image · Video · Audio · explained
20
screenpipe/screenpipe
+27 ★/day · steady

screenpipe records everything you see, say, and hear—locally, searchable, and feedable to AI agents.

19.2k Rust Agents · explained