espressif/esp-dl

Neural nets on $2 microcontrollers, now with less suffering

Espressif's official inference framework squeezes quantized models onto ESP32 chips through aggressive memory planning and LUT tricks.

esp-dl · star velocity (7d): +0.4 ★/day · trend: steady

What it does ESP-DL is Espressif’s in-house neural network inference engine for ESP32-series SoCs. You quantize a PyTorch, TensorFlow, or ONNX model with the companion ESP-PPQ tool, export it to the FlatBuffers-based .espdl format, and run inference through a C++ API that plugs into ESP-IDF. A static memory planner lays out each layer’s buffers ahead of time to fit the RAM budget you specify, and a small zoo of pre-trained models (YOLO variants, MobileNetV2, a cat detector) ships as drop-in ESP-IDF components.

The interesting bit The 8-bit LUT activation is the kind of boring optimization that actually matters: every activation except ReLU/PReLU is precomputed into a 256-entry lookup table, so each element costs one table read whatever the function, and you can swap in exotic activations without paying extra compute. Dual-core scheduling for Conv2D and DepthwiseConv2D is similarly pragmatic: no fancy runtime, just static partitioning of heavy ops across the two cores of dual-core chips.
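To make the LUT idea concrete, here is a minimal Python sketch of the general technique, not ESP-DL's actual code. The helper names (`build_int8_lut`, `apply_lut`) and the power-of-two scales are assumptions chosen for illustration.

```python
import math

def build_int8_lut(fn, in_scale, out_scale):
    """Precompute fn for every possible int8 input.

    Entry i of the table holds the quantized output for input q = i - 128.
    in_scale and out_scale are hypothetical quantization scales.
    """
    lut = []
    for q in range(-128, 128):
        y = fn(q * in_scale)                  # dequantize, apply the float activation
        yq = round(y / out_scale)             # requantize the result
        lut.append(max(-128, min(127, yq)))   # saturate to the int8 range
    return lut

def apply_lut(lut, values):
    """Run the activation: one table read per element, whatever fn was."""
    return [lut[q + 128] for q in values]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Two different activations, identical inference-time cost.
sig_lut = build_int8_lut(sigmoid, in_scale=2**-4, out_scale=2**-7)
tanh_lut = build_int8_lut(math.tanh, in_scale=2**-4, out_scale=2**-7)
```

The table is built once, offline or at model load. After that, sigmoid and tanh both reduce to an indexed read, which is why the choice of activation stops affecting runtime.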

Key highlights

  • Custom .espdl format uses FlatBuffers for zero-copy deserialization (lighter than Protobuf, according to the README)
  • Static memory planner auto-lays out layers given a user-specified RAM budget
  • 8-bit LUT activations decouple activation choice from computational cost
  • Dual-core scheduling for Conv2D/DepthwiseConv2D on applicable chips
  • AutoQuant and “espdl-quantize skill” tools now automate quantization strategy search
  • Model Zoo includes YOLO11n, YOLO26, ESPDet-Pico, and a cat detection model
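For a feel of what a static memory planner does with a RAM budget, the sketch below shows one common approach: a greedy offset assignment driven by tensor lifetimes. It is an illustrative assumption, not ESP-DL's algorithm, and the `Tensor` type, `plan` function, and example sizes are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tensor:
    name: str
    size: int    # bytes
    first: int   # step at which the tensor is produced
    last: int    # last step at which it is read

def plan(tensors, budget):
    """Assign each tensor a fixed offset in one arena; fail if the peak exceeds budget."""
    placed = []   # (offset, tensor) pairs already laid out
    offsets = {}
    # Largest tensors first: a common greedy heuristic.
    for t in sorted(tensors, key=lambda t: t.size, reverse=True):
        # Byte ranges held by tensors that are alive at the same time as t.
        busy = sorted(
            (off, off + p.size)
            for off, p in placed
            if not (p.last < t.first or t.last < p.first)
        )
        offset = 0
        for start, end in busy:
            if offset + t.size <= start:
                break                      # t fits in the gap before this range
            offset = max(offset, end)      # otherwise skip past it
        placed.append((offset, t))
        offsets[t.name] = offset
    peak = max((offsets[t.name] + t.size for t in tensors), default=0)
    if peak > budget:
        raise MemoryError(f"plan needs {peak} bytes, budget is {budget}")
    return offsets, peak

# A four-tensor chain: each buffer is only needed until the next layer has run.
demo = [
    Tensor("input", 4096, 0, 1),
    Tensor("conv1_out", 8192, 1, 2),
    Tensor("conv2_out", 4096, 2, 3),
    Tensor("logits", 64, 3, 3),
]
offsets, peak = plan(demo, budget=16384)
```

Because buffers whose lifetimes never overlap share the same bytes, the arena here peaks at 12288 bytes, while allocating every tensor separately would take 16448. Everything is decided before inference starts, so no allocator runs on the device.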

Caveats

  • Requires ESP-IDF release/v5.3 or newer; not a standalone framework
  • Operator support is finite, so check the operator support matrix before falling in love with an architecture
  • Schema updates can break backward compatibility (v3.1.0 models won’t load on older runtimes)

Verdict Worth a look if you’re already in the ESP-IDF ecosystem and need to run inference on-device without external accelerators. Skip it if you’re targeting non-Espressif silicon or need dynamic shapes and runtime graph mutation.
