Arm's ML inference engine is now legacyware
A once-performant neural-network accelerator for Arm CPUs and Mali GPUs that Arm no longer maintains.

What it does
Arm NN is an open-source C++17 SDK that accelerates ML inference on Arm Cortex-A CPUs and Mali GPUs by bridging popular frameworks to Arm-specific hardware. It supports TensorFlow Lite and ONNX models, and its TF Lite Delegate hands any operator it does not support back to the standard TF Lite runtime, so your model still runs.
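For a feel of the delegate route, here is a minimal Python sketch. It assumes the `tflite_runtime` package is installed and that a `libarmnnDelegate.so` built for the target is on the library path. The helper names `armnn_delegate_options` and `make_interpreter`, the default library path, and the backend order are illustrative assumptions. The `backends` and `logging-severity` option keys follow Arm NN's delegate guide and should be checked against the release you actually deploy.

```python
from typing import Dict, Sequence


def armnn_delegate_options(backends: Sequence[str] = ("GpuAcc", "CpuAcc", "CpuRef"),
                           log_level: str = "info") -> Dict[str, str]:
    """Build the string-valued options dict handed to the Arm NN delegate."""
    if not backends:
        raise ValueError("at least one backend is required")
    # Assumption: the list is an order of preference, so the preferred backend goes first.
    return {"backends": ",".join(backends), "logging-severity": log_level}


def make_interpreter(model_path: str, delegate_path: str = "libarmnnDelegate.so"):
    """Return a TF Lite interpreter, using Arm NN when the delegate loads."""
    # Imported lazily so the options helper works without tflite_runtime installed.
    import tflite_runtime.interpreter as tflite

    try:
        delegate = tflite.load_delegate(library=delegate_path,
                                        options=armnn_delegate_options())
    except (ValueError, OSError):
        # Delegate library missing or rejected: run everything on stock TF Lite.
        return tflite.Interpreter(model_path=model_path)
    return tflite.Interpreter(model_path=model_path,
                              experimental_delegates=[delegate])
```

The `try`/`except` here only covers the case where the delegate library itself cannot be loaded. The per-operator fallback described above happens inside the TF Lite runtime once the delegate is attached and needs no extra code.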
The interesting bit
The project leans heavily on the Arm Compute Library for architecture-specific optimizations, including kernels that use extensions such as SVE2, and ships pre-built binaries tuned for everything from basic arm64-v8a to SVE2-enabled Android builds. There’s even a Dockerized build tool if you want to roll your own.
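If you need to decide which of those builds fits a given Linux or Android device, one rough approach is to read the CPU feature flags the kernel reports. The sketch below parses the `Features` lines of arm64 `/proc/cpuinfo` and looks for the `sve2` flag. The tier names `sve2` and `baseline` are hypothetical labels for illustration, not Arm NN package names.

```python
from pathlib import Path


def cpu_features(cpuinfo_text: str) -> set:
    """Collect the flags from the 'Features' lines of arm64 /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Features":
            flags.update(value.split())
    return flags


def pick_build_tier(flags: set) -> str:
    """Map CPU flags to a hypothetical build tier."""
    if "sve2" in flags:
        return "sve2"
    return "baseline"  # plain arm64-v8a


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # On systems without /proc/cpuinfo, fall back to an empty feature set.
    text = path.read_text() if path.exists() else ""
    print(pick_build_tier(cpu_features(text)))
```

Defaulting to the baseline tier is the conservative choice: a build that uses SVE2 instructions will crash on a core that lacks them, while a baseline build merely leaves performance on the table.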
Key highlights
- TF Lite Delegate offers the widest operator support; TF Lite and ONNX parsers exist but with narrower coverage
- Pre-built binaries for Android 11–14 and Linux, including multi-ISA packages
- Python and C++ APIs available for the delegate
- Targets Ethos-N NPUs via a separate driver stack; Cortex-M users are directed to CMSIS-NN
- Debian packages available for Ubuntu 20.04
Caveats
- No longer actively maintained; Arm explicitly warns against using it with untrusted inputs or in hostile environments
- No further security patches or functional improvements expected
- The README’s “most performant” claim is not backed by any benchmark in the sources reviewed here
Verdict
Worth a look if you’re maintaining an existing Arm-based deployment in a trusted environment and need to squeeze performance from older hardware. Everyone else should probably follow Arm’s lead and move on.