yahoo/TensorFlowOnSpark
Distributed deep learning training framework for running TensorFlow programs on Apache Spark and Hadoop clusters.

Velocity · 7d
+1.1
★ / day
Trend
→steady
star history
TensorFlowOnSpark enables distributed TensorFlow training and inference across GPU and CPU servers in Spark/Hadoop clusters. It provides a Spark-compatible API that manages the TensorFlow cluster lifecycle including startup, data ingestion (with options for native TensorFlow file reading or Spark RDD feed), and graceful shutdown. The project minimizes code changes needed to migrate existing TensorFlow programs to a shared grid environment.