Keras models without the Python runtime tax
A minimal C++ inference engine for Keras networks, built for when you need predictions but can't ship a Python interpreter.

What it does
Takes a trained Keras neural network, serializes its architecture and weights to a plain text file, then runs inference in pure C++. Image input is passed as a plain vector<vector<vector<float>>>. No TensorFlow, no Python runtime, no ceremony. The workflow has two steps: run the dump_to_simple_cpp.py script on your saved model, then let the included keras_model.h and keras_model.cc handle the forward pass.
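To make the nested-vector input concrete, here is a minimal sketch of packing a flat pixel buffer into that shape. The to_image helper, the Image alias, and the channel-first [channel][row][column] ordering are assumptions made for illustration; check keras_model.h for the layout the project actually expects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Nested-vector image type, indexed as [channel][row][column].
// The channel-first ordering is an assumption, not confirmed by the project.
using Image = std::vector<std::vector<std::vector<float>>>;

// Hypothetical helper: reshape a flat, channel-major pixel buffer into an Image.
Image to_image(const std::vector<float>& flat,
               std::size_t channels, std::size_t rows, std::size_t cols) {
    // The buffer must hold exactly one value per pixel per channel.
    assert(flat.size() == channels * rows * cols);
    Image img(channels,
              std::vector<std::vector<float>>(rows, std::vector<float>(cols)));
    for (std::size_t c = 0; c < channels; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            for (std::size_t k = 0; k < cols; ++k)
                img[c][r][k] = flat[(c * rows + r) * cols + k];
    return img;
}
```

For a grayscale MNIST digit this would be called as to_image(pixels, 1, 28, 28), giving a single 28x28 channel.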
The interesting bit
The project deliberately strips away everything but the math. It targets the Theano backend specifically, stores weights in readable text rather than binary blobs, and the entire C++ core appears to be two files you could audit in an afternoon. The MNIST CNN example compiles with a single g++ command.
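To illustrate why text weights plus "just the math" stay auditable, the sketch below parses one dense layer from text and runs its forward pass. The file layout shown here ("dense <inputs> <outputs>", then the weights, then the biases) is invented for this example and is not the format that dump_to_simple_cpp.py writes; DenseLayer, parse_dense, and forward are likewise hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical dense layer: weights[i][j] connects input i to output j.
struct DenseLayer {
    std::vector<std::vector<float>> weights;
    std::vector<float> bias;
};

// Parse "dense <inputs> <outputs>", then inputs*outputs weights, then the biases.
// This layout is made up for illustration; it is not the project's dump format.
DenseLayer parse_dense(std::istream& in) {
    std::string tag;
    std::size_t n_in = 0, n_out = 0;
    if (!(in >> tag >> n_in >> n_out) || tag != "dense")
        throw std::runtime_error("expected: dense <inputs> <outputs>");
    DenseLayer layer;
    layer.weights.assign(n_in, std::vector<float>(n_out));
    layer.bias.assign(n_out, 0.0f);
    for (auto& row : layer.weights)
        for (float& w : row)
            if (!(in >> w)) throw std::runtime_error("truncated weights");
    for (float& b : layer.bias)
        if (!(in >> b)) throw std::runtime_error("truncated bias");
    return layer;
}

// Forward pass: y[j] = bias[j] + sum over i of x[i] * weights[i][j].
std::vector<float> forward(const DenseLayer& layer, const std::vector<float>& x) {
    assert(x.size() == layer.weights.size());
    std::vector<float> y = layer.bias;  // start from the bias term
    for (std::size_t i = 0; i < x.size(); ++i)
        for (std::size_t j = 0; j < y.size(); ++j)
            y[j] += x[i] * layer.weights[i][j];
    return y;
}
```

Because the weights are ordinary whitespace-separated numbers, a stream extraction loop is all the parsing needed, and a diff of two dumps shows exactly which values changed.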
Key highlights
- Supports ReLU and Softmax activations only; extendable if you need more
- Includes a test_run.sh script that verifies bit-for-bit prediction parity against Keras itself
- Works with the Theano backend (not TensorFlow)
- Input format is explicit nested vectors — no hidden tensor abstractions
- Its roughly 680 GitHub stars suggest it solved a real deployment headache for embedded or restricted environments
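For a sense of how small the two supported activations are, here is a minimal standalone sketch of ReLU and Softmax over a std::vector<float>. These are textbook definitions written for this article, not code copied from keras_model.cc; the function names and signatures are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// ReLU: clamp negative values to zero, element by element.
std::vector<float> relu(std::vector<float> v) {
    for (float& x : v) x = std::max(x, 0.0f);
    return v;
}

// Softmax, subtracting the maximum first so std::exp cannot overflow.
std::vector<float> softmax(std::vector<float> v) {
    assert(!v.empty());
    const float peak = *std::max_element(v.begin(), v.end());
    float total = 0.0f;
    for (float& x : v) {
        x = std::exp(x - peak);
        total += x;
    }
    for (float& x : v) x /= total;  // normalize so the outputs sum to 1
    return v;
}
```

Adding another activation, such as tanh or sigmoid, would follow the same pattern: one small function over a vector, plus whatever dispatch the loader uses to pick it by name.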
Caveats
- Only ReLU and Softmax are implemented; other activations require manual extension
- Theano backend only — if you’re on modern TensorFlow/Keras, this is a mismatch
- README warns the code is “prepared to support simple Convolutional network” — complex architectures may need work
Verdict
Worth a look if you’re deploying to hardware that can’t run Python or you need a tiny, inspectable inference engine you can hack on. Skip it if you need broad layer support, modern Keras compatibility, or production-grade optimization.