
rom1504/img2dataset

A Python tool that downloads, resizes, and packages image URLs into datasets for ML training.

Velocity (7d): +2.5 ★ / day
Trend: steady

img2dataset builds large-scale image datasets by downloading images from a list of URLs, resizing them, and packaging them into training-ready formats. It can pair each image with its caption to produce multimodal training data, and it can process 100M URLs in 20 hours on a single machine. The tool is positioned as infrastructure for deep learning and multimodal model training.
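To put the "100M URLs in 20 hours" figure in perspective, the arithmetic below converts it into an average per-second rate. This is only a back-of-the-envelope sketch: the helper names `urls_per_second` and `hours_for` are hypothetical, and the extrapolation assumes throughput stays constant, which real runs may not achieve because of bandwidth, DNS, and host limits.

```python
# Back-of-the-envelope throughput implied by "100M URLs in 20 hours".
# Assumption: throughput is constant over the whole run.

TOTAL_URLS = 100_000_000
HOURS = 20


def urls_per_second(total_urls: int, hours: float) -> float:
    """Average number of URLs handled per second over the run."""
    return total_urls / (hours * 3600)


def hours_for(total_urls: int, rate: float) -> float:
    """Hours needed for total_urls at a fixed rate (URLs per second)."""
    return total_urls / rate / 3600


rate = urls_per_second(TOTAL_URLS, HOURS)
print(f"{rate:.0f} URLs/s")  # prints "1389 URLs/s"
```

At that average rate, a hypothetical list ten times larger would take roughly ten times as long on the same machine, provided the constant-throughput assumption holds.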
