Snapdragon Neural Processing Engine SDK
Reference Guide
Snapdragon NPE Runtime

SNPE
Overview

This section describes the components of the Snapdragon NPE Runtime Library that run on the device.
For details on using the library, see Tutorials Setup.

At a high level, the library contains the following components:
DL Container Loader : Loads a DLC created by one of the snpe-framework-to-dlc conversion tools.

Model Validation : Validates that the loaded DLC is supported by the required runtime. See Supported Network Layers.

Runtime Engine : Executes a loaded model on requested runtime(s), including gathering profiling info and supporting UDLs (see User-Defined Layers (UDL) Tutorial).

Partitioning Logic : Processes the model, including validation of layers against the required targets, and, if needed, partitions the model into subnets based on the runtime target each subnet must run on.
For UDLs, the partitioner creates partitions such that the UDLs execute on the CPU runtime.
If CPU fallback is enabled, the partitioner splits the model between the layers supported by the target runtime and the remaining layers, which execute on the CPU runtime (provided the CPU runtime supports them).
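The CPU-fallback partitioning described above can be sketched as follows. This is a minimal, self-contained illustration, not the SNPE partitioner itself; the layer names, the supported-layer set, and the partition_model helper are all hypothetical. It greedily groups consecutive layers into subnets by whether the target runtime supports them, falling back to CPU subnets for the rest (e.g. UDLs).

```python
def partition_model(layers, target_supported):
    """Split an ordered list of layer names into subnets.

    Consecutive layers supported by the target runtime form one subnet;
    unsupported layers (e.g. UDLs) fall into consecutive CPU subnets.
    Returns a list of (runtime, [layers]) tuples in execution order.
    """
    subnets = []
    for layer in layers:
        runtime = "TARGET" if layer in target_supported else "CPU"
        if subnets and subnets[-1][0] == runtime:
            subnets[-1][1].append(layer)        # extend the current subnet
        else:
            subnets.append((runtime, [layer]))  # start a new subnet
    return subnets

# Example: a UDL in the middle forces a CPU partition between two
# target-runtime subnets.
layers = ["conv1", "relu1", "my_udl", "conv2", "softmax"]
supported = {"conv1", "relu1", "conv2", "softmax"}
print(partition_model(layers, supported))
# [('TARGET', ['conv1', 'relu1']), ('CPU', ['my_udl']), ('TARGET', ['conv2', 'softmax'])]
```

Each subnet boundary in a real deployment implies a data transfer between runtimes, which is why minimizing the number of partitions matters for performance.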

CPU Runtime : Runs the model on the CPU; supports 32-bit floating point or 8-bit quantized execution.
GPU Runtime : Runs the model on the GPU; supports hybrid or full 16-bit floating point modes.
DSP Runtime : Runs the model on the Hexagon DSP using Q6 and Hexagon NN, executing on HVX; supports 8-bit quantized execution.
AIP Runtime : Runs the model on the Hexagon DSP using Q6, Hexagon NN, and HTA; supports 8-bit quantized execution.
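The DSP and AIP runtimes execute 8-bit quantized models. As a rough illustration of the kind of affine (scale/offset) 8-bit quantization involved, here is a minimal sketch assuming a per-tensor [vmin, vmax] range mapped onto 256 levels; the function names are hypothetical and this is not the SNPE quantizer itself:

```python
def quantize_8bit(values, vmin, vmax):
    """Affine-quantize floats to uint8 codes over [vmin, vmax] (256 levels)."""
    scale = (vmax - vmin) / 255.0
    # Clamp to [0, 255] so out-of-range values saturate rather than wrap.
    q = [min(255, max(0, round((v - vmin) / scale))) for v in values]
    return q, scale

def dequantize_8bit(q, scale, vmin):
    """Recover approximate float values from uint8 codes."""
    return [vmin + qi * scale for qi in q]

q, scale = quantize_8bit([0.0, 0.25, 1.0], 0.0, 1.0)
approx = dequantize_8bit(q, scale, 0.0)  # close to the inputs, within one step
```

The round trip loses at most about half a quantization step per value, which is why a representative input range (captured during quantization) matters for accuracy on the fixed-point runtimes.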