Snapdragon Neural Processing Engine SDK
Reference Guide
Overview of UDO

Introduction

SNPE allows users to plug in custom neural network operations that the runtime engine may not support natively, in the form of User-Defined Operations (hereafter referred to as UDO). These could be operations defined in popular training frameworks such as TensorFlow, or custom operations built as framework extensions, that are not available in the SNPE SDK. They can be executed natively on any of the supported hardware accelerators for which they are implemented. SNPE provides the infrastructure to execute these operations seamlessly, with little to no overhead compared to executing internally supported operations.

Anatomy of a UDO package

SNPE allows users to provide UDO implementations in the form of dynamic libraries that can be queried, loaded and exercised to execute inference using the kernels defined within them. SNPE promotes the notion of a 'UDO package', with which a user can easily express the association between the different components of a UDO. This notion is central to all the tools that enable users to create UDO packages for use in network inference. Note, however, that at runtime SNPE interfaces directly with the various UDO libraries and not with the UDO package construct. Users are therefore free to build standalone libraries without being strictly bound to this notion of a package.
The figure below illustrates the concept of a UDO package:

[Figure: UDO package]

As seen from the picture, a UDO package consists of a registration component and an implementation component. They are usually expressed separately with one registration library and a set of implementation libraries, one for each hardware accelerator for which an implementation kernel is available. Users can optionally build both components into a single library if they so wish.

The registration library consists of methods that specify all user-defined operations and the hardware cores they are designed for. It also consists of methods that allow operations to be validated for sanity at the time of network creation. The registration library is loaded and executed on the ARM CPU.

The hardware-specific implementation libraries expose several other methods that implement operation instance creation, execution, profiling and destruction. These are implemented with the programming constructs supported by the corresponding software platforms, such as OpenCL for GPU and Hexagon-NN SDK for DSP. While core-specific implementation files may differ entirely in source, they are all required to interface with SNPE using a set of C APIs defined in $SDK_ROOT/share/SnpeUdo/include/SnpeUdo. Complete details on these APIs can be found in C++ API.
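To make the split between the two components concrete, here is a minimal conceptual sketch in Python. It is not the SNPE API (the real interface is the set of C APIs mentioned above); `Registration`, `UdoPackage`, `clip_to_unit`, and the `ClipToUnit` operation are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Kernel = Callable[[List[float]], List[float]]


@dataclass
class Registration:
    """Hypothetical stand-in for the registration component."""
    # Maps each operation type to the cores that have a kernel for it.
    supported_cores: Dict[str, List[str]]

    def validate(self, op_type: str, core: str) -> bool:
        # Loosely mirrors the sanity check done at network creation time.
        return core in self.supported_cores.get(op_type, [])


@dataclass
class UdoPackage:
    """Hypothetical model: one registration part, one kernel table per core."""
    registration: Registration
    implementations: Dict[str, Dict[str, Kernel]] = field(default_factory=dict)

    def execute(self, op_type: str, core: str, inputs: List[float]) -> List[float]:
        if not self.registration.validate(op_type, core):
            raise ValueError(f"{op_type} is not registered for {core}")
        return self.implementations[core][op_type](inputs)


def clip_to_unit(values: List[float]) -> List[float]:
    # Toy CPU kernel: clamp every element to the range [0, 1].
    return [min(1.0, max(0.0, v)) for v in values]


package = UdoPackage(
    registration=Registration(supported_cores={"ClipToUnit": ["CPU"]}),
    implementations={"CPU": {"ClipToUnit": clip_to_unit}},
)
```

In this toy model, asking for a core that the registration part does not list fails before any kernel is looked up, which is the role validation plays in the description above.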

UDO workflow

SNPE recommends the following workflow in developing and integrating a UDO into the runtime:

[Figure: UDO workflow]

The first step in the workflow is to identify the operations in the model that need to be expressed as user-defined operations and to describe their attributes in a configuration file. The format and contents of this file are described in Defining a UDO.
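As a rough illustration of the kind of information such a configuration captures, the Python sketch below describes one operation and serializes it to JSON. Everything here is an assumption made for the example: the JSON format, the field names (`package_name`, `operators`, `core_types`, and so on), and the `ScaledAdd` operation. The actual schema is the one given in Defining a UDO.

```python
import json

# Hypothetical description of a single user-defined operation.
# Every key below is a placeholder, not the real SNPE config schema.
udo_config = {
    "package_name": "ExamplePackage",
    "operators": [
        {
            "type": "ScaledAdd",
            "inputs": [{"name": "a"}, {"name": "b"}],
            "outputs": [{"name": "sum"}],
            "scalar_params": [{"name": "scale", "data_type": "float"}],
            "core_types": ["CPU", "DSP"],
        }
    ],
}

# Keys this sketch treats as mandatory for each operator entry (assumed).
REQUIRED_OPERATOR_KEYS = {"type", "inputs", "outputs", "core_types"}


def missing_keys(operator: dict) -> set:
    """Return the required keys that an operator entry lacks."""
    return REQUIRED_OPERATOR_KEYS - set(operator)


# Text form that could be written to a config file.
config_text = json.dumps(udo_config, indent=2)
```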

The next set of steps produces the components of a UDO package by creating source files for the UDO kernels and compiling them against the appropriate tool-chains to generate dynamic libraries specific to hardware cores such as the GPU and DSP. SNPE provides a tool called snpe-udo-package-generator that assists users in creating common skeleton code for interfacing with SNPE UDO APIs and leaves placeholders for users to fill in the kernel implementation. It also generates makefiles for common targets such as x86 and Android, and for the runtimes specified per target in the config file. For more details on the package generation refer to Creating a UDO Package. For details on compiling the UDO package for a specific runtime refer to Compiling a UDO package.

The config file created in the first step must also be passed to the SNPE model conversion tools, along with the actual trained model, so that the user-defined operations can be interpreted using the definitions in the file. The resulting DLC files can then be inspected using tools like snpe-dlc-info to probe the attributes of the UDOs in the model. For details on creating (and optionally quantizing) DLCs with UDOs refer to Preparing a model with UDO. Optionally, models with UDOs can also be quantized using SNPE quantization tools for use with fixed-point runtimes such as DSP. The quantizer tool estimates the quantization ranges for activations from all layers in the network, including UDOs. Since the tool runs offline on an x86 host machine, a CPU implementation of the UDO is required in order to perform inference through the entire network. This is illustrated with dotted lines in the workflow diagram. Refer to Quantizing a DLC with UDO for details on the quantization process.
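The reason a CPU implementation is needed can be seen in a small sketch of range estimation. The code below uses a simplified min/max scheme for 8-bit affine quantization, which is an assumption and not necessarily what the SNPE quantizer does; `estimate_ranges`, `quant_params`, and the toy layers are hypothetical. What it shows is that every layer, including the UDO, has to be executable on the host for its output range to be observed.

```python
from typing import Callable, Dict, List, Tuple

Layer = Callable[[List[float]], List[float]]


def estimate_ranges(layers: Dict[str, Layer],
                    samples: List[List[float]]) -> Dict[str, Tuple[float, float]]:
    """Run each sample through the layers in order and track min/max per layer."""
    ranges: Dict[str, Tuple[float, float]] = {}
    for sample in samples:
        activation = sample
        for name, layer in layers.items():
            activation = layer(activation)  # needs a host (CPU) kernel
            lo, hi = min(activation), max(activation)
            if name in ranges:
                lo, hi = min(lo, ranges[name][0]), max(hi, ranges[name][1])
            ranges[name] = (lo, hi)
    return ranges


def quant_params(lo: float, hi: float, bits: int = 8) -> Tuple[float, int]:
    """Derive an affine (scale, zero_point) pair from an observed range."""
    lo, hi = min(lo, 0.0), max(hi, 0.0)  # keep 0.0 inside the range
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point


# Toy network: a built-in layer followed by a UDO with a CPU implementation.
layers = {
    "builtin_scale": lambda xs: [2.0 * x for x in xs],
    "udo_shift": lambda xs: [x - 1.0 for x in xs],
}
ranges = estimate_ranges(layers, samples=[[0.0, 1.0], [0.5, 2.0]])
```

If the `udo_shift` entry had no host implementation, the loop could not produce activations for it or for any layer after it, so no ranges could be estimated past that point.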

The final step in this workflow is to execute network models with UDOs. SNPE applications use the UDO package to register UDO implementations within the process that runs inference on select network models. Note that these UDOs can be exercised by multiple instances of SNPE simultaneously without race conditions, which increases the overall throughput for network inference. For more details on the UDO package registration process refer to Running a model with UDO.
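One way to picture concurrent use is a registry that is populated once per process and then only read by several inference instances. The Python sketch below is an analogy under that assumption and not SNPE code; `REGISTRY`, `register_package`, `run_instance`, and the `Square` operation are hypothetical. The kernel keeps no shared mutable state, which is what makes the concurrent calls safe in this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Kernel = Callable[[List[float]], List[float]]

# Process-wide table of registered kernels; written once, then read-only.
REGISTRY: Dict[str, Kernel] = {}


def register_package(kernels: Dict[str, Kernel]) -> None:
    """Register UDO kernels once, before any inference instance starts."""
    REGISTRY.update(kernels)


def run_instance(inputs: List[float]) -> List[float]:
    """Stand-in for one inference instance that executes the UDO."""
    return REGISTRY["Square"](inputs)


register_package({"Square": lambda xs: [x * x for x in xs]})

batches = [[1.0, 2.0], [3.0], [4.0, 5.0]]
with ThreadPoolExecutor(max_workers=3) as pool:
    # Each worker plays the role of a separate instance sharing the same UDO.
    results = list(pool.map(run_instance, batches))
```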

If the DSP implementation library of the UDO is not signed for execution on a signed process domain (the default for a SNPE application), the application must request the use of an unsigned process domain. Unsigned process domains apply only to the DSP target, and allow SNPE to use unsigned UDO implementation libraries. To see how to use an unsigned process domain with the SNPE application, refer to Running a model with UDO.

UDO Backward Compatibility

This section specifies limitations of UDO packages:

  • UDOs compiled for DSP V68 with a particular SNPE release must be used with that same release; they cannot be used with a different release version.
  • Users need to recompile UDO packages generated for DSP V68 using the QNN SDK that is compatible with the particular SNPE release.
  • Users need to regenerate UDO packages for DSP V68 with the SNPE 1.49 SDK due to known limitations in snpe-udo-package-generator. As a result, UDO packages generated prior to 1.49 cannot be used with SNPE SDK 1.49. This limitation applies only to DSP V68 UDO packages.
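
The rules above can be summarized as a simple check. This is a sketch under assumptions: `UdoBuildInfo` and `v68_issues` are hypothetical helpers, versions are reduced to (major, minor) pairs, and packages for anything other than DSP V68 are simply reported as having no issue because the list does not cover them.

```python
from typing import List, NamedTuple, Tuple

Version = Tuple[int, int]  # (major, minor), e.g. (1, 49)


class UdoBuildInfo(NamedTuple):
    """Hypothetical record of how a UDO package was produced."""
    dsp_arch: str            # e.g. "v68"
    generated_with: Version  # SDK release that ran the package generator
    compiled_with: Version   # SDK release the libraries were compiled for


def v68_issues(pkg: UdoBuildInfo, runtime: Version) -> List[str]:
    """List the limitations from this section that a package runs into."""
    if pkg.dsp_arch != "v68":
        return []  # the limitations listed here apply only to DSP V68
    issues: List[str] = []
    if pkg.compiled_with != runtime:
        issues.append("recompile against the SDK matching the runtime release")
    if runtime == (1, 49) and pkg.generated_with < (1, 49):
        issues.append("regenerate the package with the 1.49 SDK")
    return issues


old_package = UdoBuildInfo("v68", generated_with=(1, 48), compiled_with=(1, 48))
new_package = UdoBuildInfo("v68", generated_with=(1, 49), compiled_with=(1, 49))
```

Here `v68_issues(old_package, (1, 49))` reports both a recompile and a regenerate step, while `new_package` on the same runtime reports none.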