Snapdragon Neural Processing Engine SDK
Reference Guide
Tools

This chapter describes the various SDK tools and features.


snpe-net-run

snpe-net-run loads a DLC file, loads the data for the input tensor(s), and executes the network on the specified runtime.

DESCRIPTION:
------------
Example application demonstrating how to load and execute a neural network
using the SNPE C++ API.


REQUIRED ARGUMENTS:
-------------------
  --container  <FILE>   Path to the DL container containing the network.
  --input_list <FILE>   Path to a file listing the inputs for the network.


OPTIONAL ARGUMENTS:
-------------------
  --use_gpu             Use the GPU runtime for SNPE.
  --use_dsp             Use the DSP fixed point runtime for SNPE.
  --use_aip             Use the AIP fixed point runtime for SNPE.
  --debug               Specifies that output from all layers of the network
                        will be saved.
  --output_dir <DIR>    The directory to save output to. Defaults to ./output
  --storage_dir <DIR>   The directory to store SNPE metadata files
  --encoding_type <VAL> Specifies the encoding type of input file. Valid settings are "nv21".
                        Cannot be combined with --userbuffer*.
  --userbuffer_float    [EXPERIMENTAL] Specifies to use userbuffer for inference, and the input type is float.
                        Cannot be combined with --encoding_type.
  --userbuffer_tf8      [EXPERIMENTAL] Specifies to use userbuffer for inference, and the input type is tf8exact0.
                        Cannot be combined with --encoding_type.
  --perf_profile <VAL>  Specifies perf profile to set. Valid settings are "system_settings", "power_saver", "balanced",
                        "default", "high_performance", "sustained_high_performance", and "burst".
                        NOTE: "balanced" and "default" are the same. "default" is being deprecated in the future.
  --profiling_level <VAL> Specifies the profiling level.  Valid settings are "off", "basic" and "detailed".
                          Default is detailed.
  --enable_cpu_fallback Enables CPU fallback functionality. Disabled by default.
  --input_name <INPUT_NAME> Specifies the name of the input for which dimensions are specified.

  --input_dimensions <INPUT_DIM>  Specifies new dimensions for input whose name is specified in input_name. e.g. "1,224,224,3".
                        For multiple inputs, specify --input_name and --input_dimensions multiple times.
  --gpu_mode <VAL>      Specifies gpu operation mode. Valid settings are "default", "float16".
                        default = float32 math and float16 storage (equiv. use_gpu arg).
                        float16 = float16 math and float16 storage.
  --help                 Show this help message.
  --version              Show SNPE Version Number.

This binary outputs raw output tensors into the output folder by default. Examples of using snpe-net-run can be found in the Running AlexNet tutorial.
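
A minimal invocation sketch (the file names are hypothetical placeholders):

    snpe-net-run --container model.dlc --input_list input_list.txt --use_dsp --output_dir output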

Additional details:

  • Running batched inputs:
    • snpe-net-run is able to automatically batch the input data. The batch size is indicated in the model container (DLC file) but can also be set using the "input_dimensions" argument passed to snpe-net-run. Users do not need to batch their input data. If the input data is not batched, the input size needs to be a multiple of the size of the input data files. snpe-net-run will group the provided inputs into batches and pad the incomplete batches (if present) with zeros. An example of overriding the batch size at run time follows the sample output below.

      In the example below, the model is set to accept batches of three inputs. So, the inputs are automatically grouped together to form batches by snpe-net-run and padding is done to the final batch. Note that there are five output files generated by snpe-net-run:

            …
            Processing DNN input(s):
            cropped/notice_sign.raw
            cropped/trash_bin.raw
            cropped/plastic_cup.raw
            Processing DNN input(s):
            cropped/handicap_sign.raw
            cropped/chairs.raw
            Applying padding
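
      A minimal sketch of overriding the batch size at run time (the input name "data" and the dimensions are hypothetical placeholders; substitute the values for your model):

            snpe-net-run --container model.dlc --input_list cropped/raw_list.txt \
                         --input_name data --input_dimensions "3,224,224,3"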
  • input_list argument:
    • snpe-net-run can take multiple input files as input data per iteration, and can specify multiple output names, in an input list file formatted as below:

            #<output_name>[<space><output_name>]
            <input_layer_name>:=<input_layer_path>[<space><input_layer_name>:=<input_layer_path>]
            …

      The first line, starting with a "#", specifies the output layers' names. If there is more than one output, whitespace should be used as a delimiter. Following the first line, you can use multiple lines to supply input files, one line per iteration, with each line supplying one file per input layer. If there is more than one input per line, whitespace should be used as a delimiter.

      Here is an example, where the layer names are "Input_1" and "Input_2", and inputs are located in the path "Placeholder_1/real_input_inputs_1/". Its input list file should look like this:

            #Output_1 Output_2
            Input_1:=Placeholder_1/real_input_inputs_1/0-0#e6fb51.rawtensor Input_2:=Placeholder_1/real_input_inputs_1/0-1#8a171b.rawtensor
            Input_1:=Placeholder_1/real_input_inputs_1/1-0#67c965.rawtensor Input_2:=Placeholder_1/real_input_inputs_1/1-1#54f1ff.rawtensor
            Input_1:=Placeholder_1/real_input_inputs_1/2-0#b42dc6.rawtensor Input_2:=Placeholder_1/real_input_inputs_1/2-1#346a0e.rawtensor

      Note: If the batch dimension of the model is greater than 1, the number of batch elements in the input file has to either match the batch dimension specified in the DLC or it has to be one. In the latter case, snpe-net-run will combine multiple lines into a single input tensor.

  • Running AIP Runtime:
    • The AIP runtime requires a DLC that was quantized and had HTA sections generated offline. See Adding HTA sections.
    • The AIP runtime does not support debug_mode.
    • The AIP runtime does not support batch.

snpe_bench.py

The python script snpe_bench.py runs a DLC neural network and collects benchmark performance information.

usage: snpe_bench.py [-h] -c CONFIG_FILE [-o OUTPUT_BASE_DIR_OVERRIDE]
                     [-v DEVICE_ID_OVERRIDE] [-r HOST_NAME] [-a]
                     [-t DEVICE_OS_TYPE_OVERRIDE] [-d] [-s SLEEP]
                     [-b USERBUFFER_MODE] [-p PERFPROFILE] [-l PROFILINGLEVEL]
                     [-json] [-cache]

Run the snpe_bench

required arguments:
  -c CONFIG_FILE, --config_file CONFIG_FILE
                        Path to a valid config file
                        Refer to sample config file config_help.json for more
                        detail on how to fill params in config file

optional arguments:
  -o OUTPUT_BASE_DIR_OVERRIDE, --output_base_dir_override OUTPUT_BASE_DIR_OVERRIDE
                        Sets the output base directory.
  -v DEVICE_ID_OVERRIDE, --device_id_override DEVICE_ID_OVERRIDE
                        Use this device ID instead of the one supplied in config
                        file. Cannot be used with -a
  -r HOST_NAME, --host_name HOST_NAME
                        Hostname/IP of remote machine to which devices are
                        connected.
  -a, --run_on_all_connected_devices_override
                        Runs on all connected devices; currently only supports 1.
                        Cannot be used with -v
  -t DEVICE_OS_TYPE_OVERRIDE, --device_os_type_override DEVICE_OS_TYPE_OVERRIDE
                        Specify the target OS type, valid options are
                        ['android', 'android-aarch64', 'le', 'le64_gcc4.9',
                        'le_oe_gcc6.4', 'le64_oe_gcc6.4']
  -d, --debug           Set to turn on debug log
  -s SLEEP, --sleep SLEEP
                        Set number of seconds to sleep between runs e.g. 20
                        seconds
  -b USERBUFFER_MODE, --userbuffer_mode USERBUFFER_MODE
                        [EXPERIMENTAL] Enable user buffer mode; defaults to
                        float, can be tf8exact0
  -p PERFPROFILE, --perfprofile PERFPROFILE
                        Set the benchmark operating mode (balanced, default,
                        sustained_high_performance, high_performance,
                        power_saver, system_settings)
  -l PROFILINGLEVEL, --profilinglevel PROFILINGLEVEL
                        Set the profiling level mode (off, basic, detailed).
                        Default is basic.
  -json, --generate_json
                        Set to produce json output.
  -cache, --enable_init_cache
                        Enable init caching mode to accelerate the network
                        building process. Disabled by default.
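
A minimal invocation sketch (the config file name is a hypothetical placeholder; see config_help.json for the expected contents):

    python snpe_bench.py -c alexnet_config.json -p burst -l basic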

snpe-caffe-to-dlc

snpe-caffe-to-dlc converts a Caffe model into an SNPE DLC file.

usage: snpe-caffe-to-dlc [-h] [--input_network INPUT_NETWORK] [-o OUTPUT_PATH]
                         [--copyright_file COPYRIGHT_FILE]
                         [--model_version MODEL_VERSION]
                         [--disable_batchnorm_folding]
                         [--input_type INPUT_NAME INPUT_TYPE]
                         [--input_encoding INPUT_NAME INPUT_ENCODING]
                         [--validation_target RUNTIME_TARGET PROCESSOR_TARGET]
                         [--strict] [--debug [DEBUG]]
                         [-b CAFFE_BIN] [--udl UDL_MODULE FACTORY_FUNCTION]

Script to convert a Caffe model into a DLC file.

optional arguments:
  -h, --help            show this help message and exit

required arguments:
  --input_network INPUT_NETWORK, -i INPUT_NETWORK
                        Path to the source framework model.

optional arguments:
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path where the converted output model should be saved.
                        If not specified, the converted model will be written
                        to a file with the same name as the input model
  --copyright_file COPYRIGHT_FILE
                        Path to copyright file. If provided, the content of
                        the file will be added to the output model.
  --model_version MODEL_VERSION
                        User-defined ASCII string to identify the model; only
                        the first 64 bytes will be stored
  --disable_batchnorm_folding
                        If not specified, the converter will try to fold
                        batchnorm into the previous convolution layer
  --input_type INPUT_NAME INPUT_TYPE, -t INPUT_NAME INPUT_TYPE
                        Type of data expected by each input op/layer. Type for
                        each input is |default| if not specified. For example:
                        "data" image. Note that the quotes should always be
                        included in order to handle special characters, spaces,
                        etc. For multiple inputs specify multiple --input_type
                        on the command line. Eg: --input_type "data1" image
                        --input_type "data2" opaque. These options get used by
                        the DSP runtime and the following descriptions state
                        how input will be handled for each option. Image: input
                        is float between 0-255 and the input's mean is 0.0f and
                        the input's max is 255.0f. We will cast the float to
                        uint8ts and pass the uint8ts to the DSP. Default: pass
                        the input as floats to the dsp directly and the DSP
                        will quantize it. Opaque: assumes input is float
                        because the consumer layer (i.e. the next layer)
                        requires it as float, therefore it won't be quantized.
                        Choices supported: ['image', 'default', 'opaque']
  --input_encoding INPUT_NAME INPUT_ENCODING, -e INPUT_NAME INPUT_ENCODING
                        Image encoding of the source images. Default is bgr.
                        Eg usage: "data" rgba. Note the quotes should always be
                        included in order to handle special characters, spaces,
                        etc. For multiple inputs specify --input_encoding for
                        each on the command line. Eg: --input_encoding "data1"
                        rgba --input_encoding "data2" other. Use options: color
                        encodings (bgr, rgb, nv21, ...) if the input is an
                        image; time_series: for inputs of rnn models; other: if
                        the input doesn't follow the above categories or is
                        unknown. Choices supported: ['bgr', 'rgb', 'rgba',
                        'argb32', 'nv21', 'time_series', 'other']
  --validation_target RUNTIME_TARGET PROCESSOR_TARGET
                        A combination of processor and runtime target against
                        which model will be validated. Choices for
                        RUNTIME_TARGET: {cpu, gpu, dsp}. Choices for
                        PROCESSOR_TARGET: {snapdragon_801, snapdragon_820,
                        snapdragon_835}. If not specified, will validate model
                        against {snapdragon_820, snapdragon_835} across all
                        runtime targets.
  --strict              If specified, will validate in strict mode whereby
                        model will not be produced if it violates constraints
                        of the specified validation target. If not specified,
                        will validate model in permissive mode against the
                        specified validation target.
  --debug [DEBUG]       Run the converter in debug mode.
  -b CAFFE_BIN, --caffe_bin CAFFE_BIN
                        Input caffe binary file containing the weight data
  --udl UDL_MODULE FACTORY_FUNCTION
                        Option to add User Defined Layers. Provide the filename
                        and the factory function name. 1. Filename: name of the
                        python module to load for registering the custom udl
                        (note: must be in PYTHONPATH). If the file is part of a
                        package, list it as package.filename as you would when
                        doing a python import. 2. Function name: name of the
                        udl factory function that returns a dictionary mapping
                        layer type (key) to a function callback (value).

Examples of using this script can be found in Converting Models from Caffe to SNPE.

Additional details:

  • input_encoding argument:
    • Specifies the encoding type of input images.
    • A preprocessing layer is added to the network to convert input images from the specified encoding to BGR, the encoding used by Caffe.
    • The encoding preprocessing layer can be seen when using snpe-dlc-info.
    • Allowed options are:
      • argb32: The ARGB32 format consists of 4 bytes per pixel: one byte for Red, one for Green, one for Blue and one for the alpha channel. The alpha channel is ignored. For little endian CPUs, the byte order is BGRA. For big endian CPUs, the byte order is ARGB.
      • rgba: The RGBA format consists of 4 bytes per pixel: one byte for Red, one for Green, one for Blue and one for the alpha channel. The alpha channel is ignored. The byte ordering is endian independent and is always RGBA byte order.
      • nv21: NV21 is the Android version of YUV. The chrominance is downsampled with a subsampling ratio of 4:2:0. Note that this image format has 3 channels, but the U and V channels are subsampled. For every four Y pixels there is one U and one V pixel.
      • bgr: The BGR format consists of 3 bytes per pixel: one byte for Red, one for Green and one for Blue. The byte ordering is endian independent and is always BGR byte order.
    • This argument is optional. If omitted then input image encoding is assumed to be BGR and no preprocessing layer is added.
    • See input_preprocessing for more details.
  • disable_batchnorm_folding argument:
    • The disable batchnorm folding argument allows the user to turn off the optimization that folds batchnorm and batchnorm + scaling layers into previous convolution layers when possible.
    • This argument is optional. If omitted, the converter will fold batchnorm and batchnorm + scaling layers into preceding convolution layers wherever possible as an optimization. When this occurs, the names of the folded batchnorm and scale layers are appended to the name of the convolution layer they were folded into.
      • For example: if a batchnorm layer named 'bn' and a scale layer named 'scale' are folded into a convolution layer named 'conv', the resulting dlc will show the convolution layer named 'conv.bn.scale'.
  • input_type argument:
    • Specifies the expected data type for a certain input layer name.
    • This argument can be passed more than once if you want to specify the expected data type of two or more input layers.
    • input_type argument takes INPUT_NAME followed by INPUT_TYPE.
    • This argument is optional. If omitted for a certain input layer, the expected data type will be of type: default.
    • Allowed options are:
      • default: Specifies that the input contains floating-point values.
      • image: Specifies that the input contains floating-point values that are all integers in the range 0..255.
      • opaque: Specifies that the input contains floating-point values that should be passed to the selected runtime without modification.
        For example an opaque tensor is passed directly to the DSP without quantization.
    • For example: [--input_type "data" image --input_type "roi" opaque]. A full command-line sketch follows this list.
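
A combined command-line sketch of the arguments discussed above (all file and input names are hypothetical placeholders):

    snpe-caffe-to-dlc -i deploy.prototxt -b model.caffemodel \
                      --input_encoding "data" rgba --input_type "data" image \
                      -o model.dlc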

snpe-caffe2-to-dlc

snpe-caffe2-to-dlc converts a Caffe2 model into an SNPE DLC file.

usage: snpe-caffe2-to-dlc [-h] -p PREDICT_NET -e EXEC_NET -i INPUT_DIM
                          INPUT_DIM [-d DLC] [--enable_preprocessing]
                          [--encoding {argb32,rgba,nv21,bgr}]
                          [--opaque_input [OPAQUE_INPUT [OPAQUE_INPUT ...]]]
                          [--model_version MODEL_VERSION]
                          [--reorder_list REORDER_LIST [REORDER_LIST ...]]
                          [--verbose]

Script to convert caffe2 networks into a DLC file.

optional arguments:
  -h, --help            show this help message and exit

required arguments:
  -p PREDICT_NET, --predict_net PREDICT_NET
                        Input caffe2 binary network definition protobuf
  -e EXEC_NET, --exec_net EXEC_NET
                        Input caffe2 binary file containing the weight data
  -i INPUT_DIM INPUT_DIM, --input_dim INPUT_DIM INPUT_DIM
                        The names and dimensions of the network input layers
                        specified in the format "input_name" B,C,H,W. Ex "data"
                        1,3,224,224. Note that the quotes should always be
                        included in order to handle special characters,
                        spaces, etc. For multiple inputs specify multiple
                        --input_dim on the command line like: --input_dim
                        "data1" 1,3,224,224 --input_dim "data2" 1,3,50,100 We
                        currently assume that all inputs have 4 dimensions.

optional arguments:
  -d DLC, --dlc DLC     Output DLC file containing the model. If not
                        specified, the data will be written to a file with
                        same name and location as the predict_net file with a
                        .dlc extension
  --copyright_file COPYRIGHT_FILE
                        Path to copyright file. If provided, the content of
                        the file will be added to the dlc.
  --enable_preprocessing
                        If specified, the converter will enable image mean
                        subtraction and cropping specified by ImageInputOp. Do
                        NOT enable if there is no ImageInputOp present in the
                        Caffe2 network.
  --encoding {argb32,rgba,nv21,bgr}
                        Image encoding of the source images. Default is bgr if
                        not specified
  --opaque_input [OPAQUE_INPUT [OPAQUE_INPUT ...]]
                        A space separated list of input blob names which
                        should be treated as opaque (non-image) data. These
                        inputs will be consumed as-is by SNPE. Any input blob
                        not listed will be assumed to be image data.
  --model_version MODEL_VERSION
                        User-defined ASCII string to identify the model, only
                        first 64 bytes will be stored
  --reorder_list REORDER_LIST [REORDER_LIST ...]
                        A list of external inputs or outputs that SNPE should
                        automatically reorder to match the specified Caffe2
                        channel ordering. Note that this feature is only
                        enabled for the GPU runtime.
  --verbose             Verbose printing
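
A minimal invocation sketch (file and blob names are hypothetical placeholders):

    snpe-caffe2-to-dlc -p predict_net.pb -e exec_net.pb -i "data" 1,3,224,224 -d model.dlc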

snpe-diagview

snpe-diagview loads a DiagLog file generated by snpe-net-run whenever it operates on input tensor data. The DiagLog file contains timing information for each layer as well as the entire forward propagation time. If the run uses an input list of input tensors, the timing info reported by snpe-diagview is an average over the entire input set.

snpe-net-run generates files named "SNPEDiag_0.log", "SNPEDiag_1.log", ..., "SNPEDiag_n.log", where n corresponds to the nth iteration of the snpe-net-run execution.

usage: snpe-diagview --input_log DIAG_LOG [-h] [--output CSV_FILE]

Reads a diagnostic log and outputs the contents to stdout

required arguments:
  --input_log     DIAG_LOG
                Diagnostic log file (required)
optional arguments:
  --output        CSV_FILE
                Output CSV file with all diagnostic data (optional)
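
A minimal invocation sketch (the CSV file name is a hypothetical placeholder; SNPEDiag_0.log is the default log name written by snpe-net-run):

    snpe-diagview --input_log output/SNPEDiag_0.log --output diag.csv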



snpe-dlc-info

snpe-dlc-info outputs layer information from a DLC file, which provides information about the network model.

usage: snpe-dlc-info [-h] -i INPUT_DLC [-s SAVE]

required arguments:
  -i INPUT_DLC, --input_dlc INPUT_DLC
                        path to a DLC file

optional arguments:
  -s SAVE, --save SAVE
                        Save the output to a csv file. Specify a target file path.
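
A minimal invocation sketch (file names are hypothetical placeholders):

    snpe-dlc-info -i model.dlc -s model_info.csv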



snpe-dlc-diff

snpe-dlc-diff compares two DLCs and by default outputs some of the following differences between them in tabular format:

  • unique layers between the two DLCs
  • parameter differences in common layers
  • differences in dimensions of buffers associated with common layers
  • weight differences in common layers
  • output tensor names differences in common layers
  • unique records between the two DLCs (currently checks for AIP records only)
usage: snpe-dlc-diff [-h] -i1 INPUT_DLC_ONE -i2 INPUT_DLC_TWO [-c] [-l] [-p]
                     [-d] [-w] [-o] [-i] [-x] [-s SAVE]

required arguments:
  -i1 INPUT_DLC_ONE, --input_dlc_one INPUT_DLC_ONE
                        path to the first dl container archive
  -i2 INPUT_DLC_TWO, --input_dlc_two INPUT_DLC_TWO
                        path to the second dl container archive

optional arguments:
  -h, --help            show this help message and exit
  -c, --copyrights      compare copyrights between models
  -l, --layers          compare unique layers between models
  -p, --parameters      compare parameter differences between identically
                        named layers
  -d, --dimensions      compare dimension differences between identically
                        named layers
  -w, --weights         compare weight differences between identically named
                        layers.
  -o, --outputs         compare output tensor name differences between
                        identically named layers
  -i, --diff_by_id      Overrides the default comparison strategy for diffing
                        the components of 2 models. By default, comparison is
                        made between identically named layers. With this option
                        the models are ordered by id and the diff is done in
                        order, as long as no more than 1 consecutive layer has
                        a different layer type.
  -x, --hta             compare HTA records differences in Models
  -s SAVE, --save SAVE  Save the output to a csv file. Specify a target file
                        path.
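
A minimal invocation sketch comparing layers, parameters and dimensions (file names are hypothetical placeholders):

    snpe-dlc-diff -i1 model_v1.dlc -i2 model_v2.dlc -l -p -d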



snpe-dlc-viewer

snpe-dlc-viewer visualizes the network structure of a DLC in a web browser.

usage: snpe-dlc-viewer [-h] -i INPUT_DLC [-s]

required arguments:
  -i INPUT_DLC, --input_dlc INPUT_DLC
                        Path to a DLC file

optional arguments:
  -s, --save            Save HTML file. Specify a file name and/or target save path
  -h, --help            Shows this help message and exits

Additional details:


The DLC viewer tool renders the specified network DLC in HTML format that may be viewed in a web browser.
On installations that support a native web browser, a browser instance is opened in which the network is automatically rendered.
Users can optionally save the HTML content anywhere on their system and open it in a chosen web browser independently at a later time.
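
A minimal invocation sketch that renders a model and saves the HTML (file names are hypothetical placeholders):

    snpe-dlc-viewer -i model.dlc -s model.html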

  • Features:
    • Graph-based representation of network model with nodes depicting layers and edges depicting buffer connections.
    • Colored legend to indicate layer types.
    • Zoom and drag options available for ease of visualization.
    • Tool-tips upon mouse hover to describe detailed layer parameters.
    • Sections showing metadata from DLC records
  • Supported browsers:
    • Google Chrome
    • Firefox
    • Internet Explorer on Windows
    • Microsoft Edge Browser on Windows
    • Safari on Mac

snpe-dlc-quantize

snpe-dlc-quantize converts non-quantized DLC models into quantized DLC models.

Command Line Options:
  [ -h,  --help ]       Displays this help message.
  [ --version ]         Displays version information.
  [ --verbose ]         Enable verbose user messages.
  [ --quiet ]           Disables some user messages.
  [ --silent ]          Disables all but fatal user messages.
  [ --debug=<val> ]     Sets the debug log level.
  [ --debug1 ]          Enables level 1 debug messages.
  [ --debug2 ]          Enables level 2 debug messages.
  [ --debug3 ]          Enables level 3 debug messages.
  [ --log-mask=<val> ]  Sets the debug log mask to set the log level for one or more areas.
                        Example: ".*=USER_ERROR, .*=INFO, NDK=DEBUG2, NCC=DEBUG3"
  [ --log-file=<val> ]  Overrides the default name for the debug log file.
  [ --log-dir=<val> ]   Overrides the default directory path where debug log files are written.
  [ --log-file-include-hostname ]
                        Appends the name of this host to the log file name.
  --input_dlc=<val>     Path to the dlc container containing the model for which fixed-point encoding
                        metadata should be generated. This argument is required.
  --input_list=<val>    Path to a file specifying the trial inputs. This file should be a plain text file,
                        containing one or more absolute file paths per line. These files will be taken to constitute
                        the trial set. Each path is expected to point to a binary file containing one trial input
                        in the 'raw' format, ready to be consumed by SNPE without any further modifications.
                        This is similar to how input is provided to snpe-net-run application.
  [ --no_weight_quantization ]
                        Generate and add the fixed-point encoding metadata but keep the weights in
                        floating point. This argument is optional.
  [ --output_dlc=<val> ]
                        Path at which the metadata-included quantized model container should be written.
                        If this argument is omitted, the quantized model will be written at <unquantized_model_name>_quantized.dlc.
  [ --enable_hta ]      Pack HTA information in the quantized DLC.
  [ --hta_partitions=<val> ]
                        Specify subnet partitions to run on HTA.
                        Partitions are specified with start and end layer IDs, like 0-20.
                        Multiple partitions can also be specified as a comma-separated string, like 0-20,40-60,80-100.
  [ --use_enhanced_quantizer ]
                        Use the enhanced quantizer feature when quantizing the model.  Regular quantization determines the range using the actual
                        values of min and max of the data being quantized.  Enhanced quantization uses an algorithm to determine optimal range.  It can be
                        useful for quantizing models that have long tails in the distribution of the data being quantized. Valid single argument options are "weights"
                        for quantized weights only, "activations" for quantized activations only, or leave blank to quantize both weights and activations.
  [ --optimizations ]   Use this option to enable new optimization algorithms. Usage is:
                        --optimizations <algo_name1> <algo_name2>
                        The available optimization algorithms are:
                        cle - Cross layer equalization includes a number of methods for equalizing weights and biases across layers in order to rectify imbalances that cause quantization errors.
                        bc - Bias correction adjusts biases to offset activation quantization errors. Typically used in conjunction with 'cle' to improve quantization accuracy.
  [ --use_symmetric_quantize_weights ]
                        Use the symmetric quantizer feature when quantizing the weights of the model. It makes sure min and max have the
                        same absolute values about zero, so that zero quantizes to 128.

Description:
Generate 8 bit TensorFlow style fixed point weight and activation encodings for a floating point SNPE model.
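
A minimal invocation sketch (file names are hypothetical placeholders; the input list follows the format accepted by snpe-net-run):

    snpe-dlc-quantize --input_dlc=model.dlc --input_list=image_file_list.txt \
                      --output_dlc=model_quantized.dlc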


Additional details:

  • For specifying input_list, refer to the input_list argument in snpe-net-run for supported input formats (in order to calculate output activation encoding information for all layers, do not include the line which specifies desired outputs).
  • The tool requires the batch dimension of the DLC input file to be set to 1 during the original model conversion step.
  • An example of quantization using snpe-dlc-quantize can be found in the C++ Tutorial section: Running the Inception v3 Model. For details on quantization see Quantized vs Non-Quantized Models.
  • Using snpe-dlc-quantize is mandatory for running on HTA. See Adding HTA sections.



snpe-tensorflow-to-dlc

snpe-tensorflow-to-dlc converts a TensorFlow model into an SNPE DLC file.

usage: snpe-tensorflow-to-dlc [-h] [--input_network INPUT_NETWORK]
                              [-o OUTPUT_PATH]
                              [--copyright_file COPYRIGHT_FILE]
                              [--model_version MODEL_VERSION]
                              [--disable_batchnorm_folding]
                              [--input_type INPUT_NAME INPUT_TYPE]
                              [--input_encoding INPUT_NAME INPUT_ENCODING]
                              [--validation_target RUNTIME_TARGET PROCESSOR_TARGET]
                              [--strict] [--debug [DEBUG]] -d INPUT_NAME
                              INPUT_DIM --out_node OUT_NODE
                              [--allow_unconsumed_nodes]

Script to convert a TensorFlow model into a DLC file.

optional arguments:
  -h, --help            show this help message and exit

required arguments:
  --input_network INPUT_NETWORK, -i INPUT_NETWORK
                        Path to the source framework model.
  -d INPUT_NAME INPUT_DIM, --input_dim INPUT_NAME INPUT_DIM
                        The names and dimensions of the network input layers
                        specified in the format "input_name" comma-separated-
                        dimensions, for example: "data" 1,224,224,3. Note that
                        the quotes should always be included in order to
                        handle special characters, spaces, etc. For multiple
                        inputs specify multiple --input_dim on the command
                        line like: --input_dim "data1" 1,224,224,3 --input_dim
                        "data2" 1,50,100,3.
  --out_node OUT_NODE   Name of the graph's output node.

optional arguments:
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path where the converted output model should be saved.
                        If not specified, the converted model will be written
                        to a file with the same name as the input model
  --copyright_file COPYRIGHT_FILE
                        Path to copyright file. If provided, the content of
                        the file will be added to the output model.
  --model_version MODEL_VERSION
                        User-defined ASCII string to identify the model; only
                        the first 64 bytes will be stored
  --disable_batchnorm_folding
                        If not specified, the converter will try to fold
                        batchnorm into the previous convolution layer
  --input_type INPUT_NAME INPUT_TYPE, -t INPUT_NAME INPUT_TYPE
                        Type of data expected by each input op/layer. Type for
                        each input is |default| if not specified. For example:
                        "data" image. Note that the quotes should always be
                        included in order to handle special characters, spaces,
                        etc. For multiple inputs specify multiple --input_type
                        on the command line. Eg: --input_type "data1" image
                        --input_type "data2" opaque. These options get used by
                        the DSP runtime and the following descriptions state
                        how input will be handled for each option. Image: input
                        is float between 0-255 and the input's mean is 0.0f and
                        the input's max is 255.0f. We will cast the float to
                        uint8ts and pass the uint8ts to the DSP. Default: pass
                        the input as floats to the dsp directly and the DSP
                        will quantize it. Opaque: assumes input is float
                        because the consumer layer (i.e. the next layer)
                        requires it as float, therefore it won't be quantized.
                        Choices supported: ['image', 'default', 'opaque']
  --input_encoding INPUT_NAME INPUT_ENCODING, -e INPUT_NAME INPUT_ENCODING
                        Image encoding of the source images. Default is bgr.
                        Eg usage: "data" rgba. Note the quotes should always be
                        included in order to handle special characters, spaces,
                        etc. For multiple inputs specify --input_encoding for
                        each on the command line. Eg: --input_encoding "data1"
                        rgba --input_encoding "data2" other. Use options: color
                        encodings (bgr, rgb, nv21, ...) if the input is an
                        image; time_series: for inputs of rnn models; other: if
                        the input doesn't follow the above categories or is
                        unknown. Choices supported: ['bgr', 'rgb', 'rgba',
                        'argb32', 'nv21', 'time_series', 'other']
  --validation_target RUNTIME_TARGET PROCESSOR_TARGET
                        A combination of processor and runtime target against
                        which model will be validated. Choices for
                        RUNTIME_TARGET: {cpu, gpu, dsp}. Choices for
                        PROCESSOR_TARGET: {snapdragon_801, snapdragon_820,
                        snapdragon_835}. If not specified, will validate model
                        against {snapdragon_820, snapdragon_835} across all
                        runtime targets.
  --strict              If specified, will validate in strict mode whereby
                        model will not be produced if it violates constraints
                        of the specified validation target. If not specified,
                        will validate model in permissive mode against the
                        specified validation target.
  --debug [DEBUG]       Run the converter in debug mode.
  --allow_unconsumed_nodes
                        Uses a relaxed graph node to layer mapping algorithm
                        which may not use all graph nodes during conversion
                        while retaining structural integrity.

Examples of using this script can be found in Converting Models from TensorFlow to SNPE.

Additional details:

  • input_network argument:
    • The converter supports either a single frozen graph .pb file or a pair of graph meta and checkpoint files.
    • If you are using the TensorFlow Saver to save your graph during training, 3 files will be generated as described below:
      1. <model-name>.meta
      2. <model-name>
      3. checkpoint
    • The converter --input_network option specifies the path to the graph meta file. The converter will also use the checkpoint file to read the graph nodes parameters during conversion. The checkpoint file must have the same name without the .meta suffix.
    • This argument is required.
  • input_dim argument:
    • Specifies the input dimensions of the graph's input node(s)
    • The converter requires a node name along with dimensions as input, from which it will create an input layer by using the node output tensor dimensions. When defining a graph, there is typically a placeholder name used as input during training in the graph. The placeholder tensor name is the name you must use as the argument. It is also possible to use other types of nodes as input; however, the node used as input will not be used as part of a layer other than the input layer.
    • Multiple Inputs
      • Networks with multiple inputs must provide --input_dim INPUT_NAME INPUT_DIM, one for each input node.
    • This argument is required.
  • out_node argument:
    • The name of the last node in your TensorFlow graph which will represent the output layer of your network.

    • Multiple Outputs
      • Networks with multiple outputs must provide several --out_node arguments, one for each output node.
  • output_path argument:
    • Specifies the output DLC file name.
    • This argument is optional. If not provided, the converter will create a DLC file with the same name as the graph file, with a .dlc file extension.
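
A minimal invocation sketch tying the required arguments together (file, input, and node names are hypothetical placeholders):

    snpe-tensorflow-to-dlc -i frozen_graph.pb -d "input" 1,224,224,3 \
                           --out_node "softmax" -o model.dlc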

snpe-onnx-to-dlc

snpe-onnx-to-dlc converts a serialized ONNX model into an SNPE DLC file.

usage: snpe-onnx-to-dlc [-h] [--input_network INPUT_NETWORK] [-o OUTPUT_PATH]
                        [--copyright_file COPYRIGHT_FILE]
                        [--model_version MODEL_VERSION]
                        [--disable_batchnorm_folding]
                        [--input_type INPUT_NAME INPUT_TYPE]
                        [--input_encoding INPUT_NAME INPUT_ENCODING]
                        [--validation_target RUNTIME_TARGET PROCESSOR_TARGET]
                        [--strict] [--debug [DEBUG]]
                        [--dry_run [DRY_RUN]]

Script to convert an ONNX model into a DLC file.

optional arguments:
  -h, --help            show this help message and exit

required arguments:
  --input_network INPUT_NETWORK, -i INPUT_NETWORK
                        Path to the source framework model.

optional arguments:
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path where the converted output model should be saved.
                        If not specified, the converted model will be written
                        to a file with the same name as the input model
  --copyright_file COPYRIGHT_FILE
                        Path to copyright file. If provided, the content of
                        the file will be added to the output model.
  --model_version MODEL_VERSION
                        User-defined ASCII string to identify the model; only
                        the first 64 bytes will be stored
  --disable_batchnorm_folding
                        If not specified, the converter will try to fold
                        batchnorm into the previous convolution layer
  --input_type INPUT_NAME INPUT_TYPE, -t INPUT_NAME INPUT_TYPE
                        Type of data expected by each input op/layer. Type for
                        each input is |default| if not specified. For example:
                        "data" image. Note that the quotes should always be
                        included in order to handle special characters, spaces,
                        etc. For multiple inputs specify multiple --input_type
                        on the command line. Eg: --input_type "data1" image
                        --input_type "data2" opaque. These options get used by
                        the DSP runtime and the following descriptions state
                        how input will be handled for each option. Image: input
                        is float between 0-255 and the input's mean is 0.0f and
                        the input's max is 255.0f. We will cast the float to
                        uint8ts and pass the uint8ts to the DSP. Default: pass
                        the input as floats to the dsp directly and the DSP
                        will quantize it. Opaque: assumes input is float
                        because the consumer layer (i.e. the next layer)
                        requires it as float, therefore it won't be quantized.
                        Choices supported: ['image', 'default', 'opaque']
  --input_encoding INPUT_NAME INPUT_ENCODING, -e INPUT_NAME INPUT_ENCODING
                        Image encoding of the source images. Default is bgr.
                        Eg usage: "data" rgba. Note the quotes should always be
                        included in order to handle special characters, spaces,
                        etc. For multiple inputs specify --input_encoding for
                        each on the command line. Eg: --input_encoding "data1"
                        rgba --input_encoding "data2" other. Use options: color
                        encodings (bgr, rgb, nv21, ...) if the input is an
                        image; time_series: for inputs of rnn models; other: if
                        the input doesn't follow the above categories or is
                        unknown. Choices supported: ['bgr', 'rgb', 'rgba',
                        'argb32', 'nv21', 'time_series', 'other']
  --validation_target RUNTIME_TARGET PROCESSOR_TARGET
                        A combination of processor and runtime target against
                        which model will be validated. Choices for
                        RUNTIME_TARGET: {cpu, gpu, dsp}. Choices for
                        PROCESSOR_TARGET: {snapdragon_801, snapdragon_820,
                        snapdragon_835}. If not specified, will validate model
                        against {snapdragon_820, snapdragon_835} across all
                        runtime targets.
  --strict              If specified, will validate in strict mode whereby
                        model will not be produced if it violates constraints
                        of the specified validation target. If not specified,
                        will validate model in permissive mode against the
                        specified validation target.
  --debug [DEBUG]       Run the converter in debug mode.
  --dry_run [DRY_RUN]   Evaluates the model without actually converting any
                        ops, and returns unsupported ops/attributes as well as
                        unused inputs and/or outputs, if any. Leave empty or
                        specify "info" to see the dry run as a table, or
                        specify "debug" to show more detailed messages.

For more information, see ONNX Model Conversion.
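
A minimal invocation sketch (file names are hypothetical placeholders):

    snpe-onnx-to-dlc --input_network model.onnx -o model.dlc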


snpe-platform-validator

DESCRIPTION:
------------
snpe-platform-validator checks the SNPE compatibility/capability of a device. This tool runs on the device,
rather than on the host, and requires a few additional files to be pushed to the device besides its own executable.
Additional details below.


REQUIRED ARGUMENTS:
-------------------
  --runtime <RUNTIME>   Specify the runtime to validate. <RUNTIME> : gpu, dsp, aip, all.

OPTIONAL ARGUMENTS:
-------------------
  --coreVersion         Query the runtime core descriptor.
  --libVersion          Query the runtime core library API.
  --testRuntime         Run diagnostic tests on the specified runtime.
  --targetPath <DIR>    The directory to save output on the device. Defaults to /data/local/tmp/platformValidator/output.
  --debug               Turn on verbose logging.
  --help                Show this help message.

Additional details:

  • Files needed to be pushed to the device:
    •    bin/snpe-platform-validator
         lib/libcalculator_domains.so
         lib/libcalculator.so
         lib/libsnpe_adsp.so
         lib/libsnpe_dsp_domains.so
         lib/libsnpe_dsp_domains_v2.so
         lib/libsnpe_dsp_domains_system.so
         lib/dsp/libcalculator_skel.so
         lib/dsp/libsnpe_dsp_domains_skel.so
         lib/dsp/libsnpe_dsp_skel.so
         lib/dsp/libsnpe_dsp_v65_domains_v2_skel.so
      
         example: for pushing arm-android-clang6.0 variant to /data/local/tmp/platformValidator
      
         adb push $SNPE_ROOT/bin/arm-android-clang6.0/snpe-platform-validator /data/local/tmp/platformValidator/bin/snpe-platform-validator
         adb push $SNPE_ROOT/lib/arm-android-clang6.0 /data/local/tmp/platformValidator/lib
         adb push $SNPE_ROOT/lib/dsp /data/local/tmp/platformValidator/dsp
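
         A possible on-device invocation after the files above are pushed (setting the library search paths as shown is an assumption and may need adjusting for your target):

         adb shell "cd /data/local/tmp/platformValidator && LD_LIBRARY_PATH=lib ADSP_LIBRARY_PATH=dsp bin/snpe-platform-validator --runtime all"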

snpe-throughput-net-run

snpe-throughput-net-run concurrently runs multiple instances of SNPE for a specified duration of time and measures inference throughput. Each instance of SNPE can have its own model, designated runtime and performance profile. Please note that the "--duration" parameter is common to all instances of SNPE created.

DESCRIPTION:
------------
Example application demonstrating how to load concurrent SNPE objects
using the SNPE C++ API.


REQUIRED ARGUMENTS:
-------------------
  --container  <FILE>   Path to the DL container containing the network.
  --duration   <VAL>    Duration of time (in seconds) to run network execution.
  --use_cpu             Use the CPU runtime for SNPE.
  --use_gpu             Use the GPU float32 runtime for SNPE.
  --use_gpu_fp16        Use the GPU float16 runtime for SNPE.
  --use_dsp             Use the DSP fixed point runtime for SNPE.
  --perf_profile <VAL>  Specifies perf profile to set. Valid settings are "balanced", "default", "high_performance",
                        "sustained_high_performance", "burst", "power_saver" and "system_settings".
                        NOTE: "balanced" and "default" are the same. "default" is being deprecated in the future.
  --use_aip             Use the AIP fixed point runtime for SNPE


OPTIONAL ARGUMENTS:
-------------------
  --debug               Specifies that output from all layers of the network
                        will be saved.
  --storage_dir <DIR>   The directory to store SNPE metadata files
  --version             Show SNPE Version Number.
  --iterations <VAL>    Number of times to iterate through entire input list
  --verbose             Print more debug information.
  --enable_cpu_fallback Enables CPU fallback functionality. Disabled by default.
  --help                Show this help message.
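
A minimal single-instance invocation sketch (the container name is a hypothetical placeholder):

    snpe-throughput-net-run --container model.dlc --duration 60 --use_dsp --perf_profile burst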

snpe-platform-validator-py

DESCRIPTION:
------------
snpe-platform-validator-py checks the SNPE compatibility/capability of a device. The output is saved in a CSV file in the
"Output" directory. Basic logs are also displayed on the console.

REQUIRED ARGUMENTS:
-------------------
  --runtime <RUNTIME>      Specify the runtime to validate. <RUNTIME> : gpu, dsp, aip, all.
  --directory <ARTIFACTS>  Path to the root of the unpacked SDK directory containing the executable and library files.

OPTIONAL ARGUMENTS:
-------------------
  --buildVariant <VARIANT>      Specify the build variant (e.g: arm-android-clang6.0(default), aarch64-android-clang6.0) to be validated.
  --deviceId                    Uses the specified device for running the adb command. Defaults to the first device in the adb devices list.
  --coreVersion                 Outputs the version of the runtime that is present on the target.
  --libVersion                  Outputs the library version of the runtime that is present on the target.
  --testRuntime                 Runs a small program on the runtime and checks whether SNPE is supported for that runtime.
  --targetPath <PATH>           The path to be used on the device. Defaults to /data/local/tmp/platformValidator
                                NOTE that this directory will be deleted before proceeding with validation.
  --remoteHost <REMOTEHOST>     Run on remote host through remote adb server. Defaults to localhost.
  --debug                       Set to turn on debug log.
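
A minimal invocation sketch (assuming the script is on the PATH and $SNPE_ROOT points to the unpacked SDK root):

    snpe-platform-validator-py --runtime all --directory $SNPE_ROOT --testRuntime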