
mosaicml/streaming

A library for efficiently streaming training datasets from cloud storage during neural network training.

1.5k stars · Python · Data Tooling · streaming
Velocity (7d): +1.0 ★ / day · Trend: steady

The library provides fast, accurate streaming of training data from cloud storage for machine learning workflows. It is designed to keep data loading efficient during deep learning training by delivering dataset samples to distributed workers. It targets large-scale GPU training, where keeping compute workers fed with data is critical.
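
To make "distributed dataset delivery" concrete, here is a minimal sketch of one common way to split a dataset across training workers so that each sample goes to exactly one worker. It is an illustration of the general idea only, not the library's API or its actual partitioning scheme; the function name `partition_samples` and its parameters are hypothetical.

```python
from typing import List


def partition_samples(num_samples: int, world_size: int, rank: int) -> List[int]:
    """Return the sample indices assigned to one worker.

    Hypothetical helper for illustration; not part of mosaicml/streaming.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    # Strided assignment: worker r receives samples r, r + world_size, ...
    # Every index appears on exactly one worker, and worker loads differ
    # by at most one sample.
    return list(range(rank, num_samples, world_size))


if __name__ == "__main__":
    # Show how 10 samples are spread over 4 workers.
    for rank in range(4):
        print(rank, partition_samples(10, 4, rank))
```

In a real training job each worker would fetch only the data behind its own indices from cloud storage, which is what keeps the GPUs from waiting on one another for input.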
