tqchen/tinyflow
A minimal deep learning system implemented in ~2k lines of C++ and Lua that demonstrates computational graph construction, operator implementation, and GPU execution.

TinyFlow is an educational deep learning framework that builds a computational-graph-based system from scratch. It provides operator implementations built on Torch7, an execution runtime with memory management, and a symbolic API inspired by TensorFlow. The system demonstrates key deep learning system concepts, including symbolic differentiation, operator fusion, and a modular intermediate representation provided by NNVM.
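
To make the idea of symbolic differentiation on a computational graph concrete, here is a minimal sketch in Python. It is not TinyFlow's code or API: the names `Node`, `placeholder`, `const`, `topo_order`, `gradients`, and `evaluate` are hypothetical and chosen only for illustration, and the sketch assumes just two binary operators (`add` and `mul`) on scalars. The point it shows is that the gradient is built as new graph nodes rather than computed numerically, so the same executor can run both the forward and the gradient graph.

```python
class Node:
    """A node in a toy computational graph (hypothetical, not TinyFlow's API)."""

    def __init__(self, op, inputs=(), name=None, value=None):
        self.op = op                # "placeholder", "const", "add", or "mul"
        self.inputs = list(inputs)  # upstream nodes this node depends on
        self.name = name            # only used by "placeholder" nodes
        self.value = value          # only used by "const" nodes

    def __add__(self, other):
        return Node("add", [self, other])

    def __mul__(self, other):
        return Node("mul", [self, other])


def placeholder(name):
    return Node("placeholder", name=name)


def const(value):
    return Node("const", value=value)


def topo_order(output):
    """Return all nodes reachable from `output`, inputs before consumers."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for inp in node.inputs:
            visit(inp)
        order.append(node)

    visit(output)
    return order


def gradients(output, wrt):
    """Build graph nodes representing d(output)/d(w) for each w in `wrt`."""
    grads = {output: const(1.0)}
    # Walk consumers before producers so each gradient is complete when used.
    for node in reversed(topo_order(output)):
        g = grads.get(node)
        if g is None or not node.inputs:
            continue
        a, b = node.inputs
        if node.op == "add":
            contribs = [(a, g), (b, g)]
        else:  # "mul": product rule
            contribs = [(a, g * b), (b, g * a)]
        for inp, c in contribs:
            if inp in grads:
                grads[inp] = grads[inp] + c  # accumulate over multiple uses
            else:
                grads[inp] = c
    return [grads.get(w, const(0.0)) for w in wrt]


def evaluate(output, feed):
    """Execute the graph in topological order with placeholder values in `feed`."""
    values = {}
    for node in topo_order(output):
        if node.op == "placeholder":
            values[node] = feed[node.name]
        elif node.op == "const":
            values[node] = node.value
        elif node.op == "add":
            values[node] = values[node.inputs[0]] + values[node.inputs[1]]
        else:  # "mul"
            values[node] = values[node.inputs[0]] * values[node.inputs[1]]
    return values[output]


x, y = placeholder("x"), placeholder("y")
z = x * y + x                    # forward graph
dz_dx, dz_dy = gradients(z, [x, y])  # gradient graphs: y + 1 and x
feed = {"x": 3.0, "y": 4.0}
print(evaluate(z, feed), evaluate(dz_dx, feed), evaluate(dz_dy, feed))
# prints: 15.0 5.0 3.0
```

Because `gradients` returns ordinary nodes, the result can itself be differentiated or optimized by later graph passes. That separation between graph construction, graph transformation, and execution is the design a graph-based system of this kind relies on.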