RedisAI is dead; long live the confusion
A once-active Redis module for in-database ML inference is now unmaintained, renamed, and quietly buried.

What it does
Redis-inference-optimization (formerly RedisAI) is a Redis module that loads deep-learning models—PyTorch, TensorFlow, TensorFlow Lite, ONNXRuntime—and runs inference inside the Redis process. The pitch: keep data and computation close, skip the network hop to a separate serving layer.
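To make "inference inside the Redis process" concrete, here is a minimal sketch of the round trip a client might drive: write an input tensor, execute a stored model, read the output tensor. The command names (AI.TENSORSET, AI.MODELSTORE, AI.MODELEXECUTE, AI.TENSORGET) and their argument order are written from memory of the module's command reference, so treat them as assumptions and check them against the version you actually run. The helper names and key names are hypothetical, and `conn` is assumed to be any client object exposing redis-py's `execute_command(*args)`.

```python
from typing import Any, List, Sequence


def tensorset_args(key: str, dtype: str, shape: Sequence[int],
                   values: Sequence[float]) -> List[Any]:
    # Assumed form: AI.TENSORSET <key> <type> <shape...> VALUES <values...>
    return ["AI.TENSORSET", key, dtype, *shape, "VALUES", *values]


def modelstore_args(key: str, backend: str, device: str, blob: bytes,
                    inputs: Sequence[str] = (),
                    outputs: Sequence[str] = ()) -> List[Any]:
    # Assumed form: AI.MODELSTORE <key> <backend> <device>
    #               [INPUTS n name...] [OUTPUTS n name...] BLOB <model>
    args: List[Any] = ["AI.MODELSTORE", key, backend, device]
    if inputs:
        args += ["INPUTS", len(inputs), *inputs]
    if outputs:
        args += ["OUTPUTS", len(outputs), *outputs]
    return args + ["BLOB", blob]


def modelexecute_args(key: str, inputs: Sequence[str],
                      outputs: Sequence[str]) -> List[Any]:
    # Assumed form: AI.MODELEXECUTE <key> INPUTS n key... OUTPUTS n key...
    return ["AI.MODELEXECUTE", key,
            "INPUTS", len(inputs), *inputs,
            "OUTPUTS", len(outputs), *outputs]


def tensorget_args(key: str) -> List[Any]:
    # Assumed form: AI.TENSORGET <key> VALUES
    return ["AI.TENSORGET", key, "VALUES"]


def infer(conn: Any, model_key: str, in_key: str, out_key: str,
          shape: Sequence[int], values: Sequence[float]) -> Any:
    """Run one inference: the tensors stay in Redis between the three calls."""
    conn.execute_command(*tensorset_args(in_key, "FLOAT", shape, values))
    conn.execute_command(*modelexecute_args(model_key, [in_key], [out_key]))
    return conn.execute_command(*tensorget_args(out_key))
```

With redis-py installed and a server that has the module loaded, `conn` could be `redis.Redis()`. The model itself would be stored once beforehand with something like `conn.execute_command(*modelstore_args("m", "TORCH", "CPU", blob))`, where `blob` is the serialized model file read as bytes.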
The interesting bit
The “data locality” principle sounds obvious but is genuinely hard in practice. Most model-serving stacks are separate services; this one piggybacks on Redis’ event loop and replication. Whether that meaningfully reduces latency depends on whether your bottleneck is network round-trips or GPU scheduling: co-location removes a network hop, but it does nothing for time spent queued for or running on the accelerator.
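A back-of-envelope way to see that trade-off, as a sketch only: model end-to-end latency as network hops times round-trip time, plus compute time, and compare a two-hop setup (fetch data from Redis, then call a separate model server) with a one-hop co-located setup. The function names and every number below are invented for illustration, not measurements of this module.

```python
def end_to_end_ms(rtt_ms: float, hops: int, compute_ms: float) -> float:
    """Toy latency model: each hop costs one round trip, plus compute time."""
    return rtt_ms * hops + compute_ms


def colocation_saving(rtt_ms: float, compute_ms: float) -> float:
    """Fraction of latency removed by dropping from two hops to one."""
    separate = end_to_end_ms(rtt_ms, 2, compute_ms)   # data store + model server
    colocated = end_to_end_ms(rtt_ms, 1, compute_ms)  # inference next to the data
    return (separate - colocated) / separate


# Hypothetical small model: compute is on the order of the round trip.
small = colocation_saving(rtt_ms=0.5, compute_ms=1.0)

# Hypothetical heavy model: compute dwarfs the round trip.
heavy = colocation_saving(rtt_ms=0.5, compute_ms=50.0)

print(f"small model saving: {small:.1%}")
print(f"heavy model saving: {heavy:.1%}")
```

Under these made-up inputs the saved hop is a quarter of the total for the small model and about one percent for the heavy one, which is the sense in which the benefit depends on where the time actually goes.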
Key highlights
- Supports four backends; module version 1.2.7 pins PyTorch 1.11.0, TensorFlow 2.8.0, TFLite 2.0.0, and ONNXRuntime 1.11.1
- GPU builds available via CUDA 11.3 / cuDNN 8.1
- Client libraries in Java, Python, Go, TypeScript, C++, C, Fortran
- Dual-licensed: RSALv2 or SSPLv1
- Requires Redis v6.0.0+
Caveats
- No longer maintained. The README opens with a caution banner; the project was renamed in January 2025 specifically to distance it from Redis’ current AI offerings.
- Backend versions are pinned and finicky; the README warns that serialization formats may not match across versions.
- Docker images are pinned to an older Ubuntu release (Bionic, 18.04).
Verdict
Worth studying if you’re building an embedded inference engine or maintaining legacy RedisAI deployments. For new projects, Redis points to its current AI offerings instead—this repo is essentially a historical artifact with decent documentation.