Snapdragon Neural Processing Engine SDK
Reference Guide
This chapter describes the various SDK tools and features.
snpe-net-run loads a DLC file, loads the data for the input tensor(s), and executes the network on the specified runtime.
DESCRIPTION:
------------
Example application demonstrating how to load and execute a neural network using the SNPE C/C++ API.

REQUIRED ARGUMENTS:
-------------------
  --container <FILE>        Path to the DL container containing the network.
  --input_list <FILE>       Path to a file listing the inputs for the network.

OPTIONAL ARGUMENTS:
-------------------
  --use_gpu                 Use the GPU runtime for SNPE.
  --use_dsp                 Use the DSP fixed point runtime for SNPE.
  --debug                   Specifies that output from all layers of the network will be saved.
  --output_dir=<val>        The directory to save output to. Defaults to ./output
  --storage_dir=<val>       The directory to store SNPE metadata files.
  --encoding_type=<val>     Specifies the encoding type of the input file. Valid settings are "nv21". Cannot be combined with --userbuffer*.
  --use_native_input_files  Specifies to consume the input file(s) in their native data type(s). Must be used with --userbuffer_xxx.
  --use_native_output_files Specifies to write the output file(s) in their native data type(s). Must be used with --userbuffer_xxx.
  --userbuffer_auto         Specifies to use userbuffer for input and output, with auto detection of types enabled. Must be used with user specified buffer. Cannot be combined with --encoding_type.
  --userbuffer_float        Specifies to use userbuffer for inference, and the input type is float. Cannot be combined with --encoding_type.
  --userbuffer_floatN=<val> Specifies to use userbuffer for inference, and the input type is float 16 or float 32. Cannot be combined with --encoding_type.
  --userbuffer_tf8          Specifies to use userbuffer for inference, and the input type is tf8exact0. Cannot be combined with --encoding_type.
  --userbuffer_tfN=<val>    Specifies to use userbuffer for inference, and the input type is tf8exact0 or tf16exact0. Cannot be combined with --encoding_type.
  --userbuffer_float_output Overrides the userbuffer output used for inference, and the output type is float. Must be used with user specified buffer.
  --userbuffer_floatN_output=<val> Overrides the userbuffer output used for inference, and the output type is float 16 or float 32. Must be used with user specified buffer.
  --userbuffer_tfN_output=<val> Overrides the userbuffer output used for inference, and the output type is tf8exact0 or tf16exact0. Must be used with user specified buffer.
  --userbuffer_tf8_output   Overrides the userbuffer output used for inference, and the output type is tf8exact0.
  --userbuffer_uintN_output=<val> Overrides the userbuffer output used for inference, and the output type is uint N. Must be used with user specified buffer.
  --static_min_max          Specifies to use quantization parameters from the model instead of input specific quantization. Used in conjunction with --userbuffer_tf8.
  --resizable_dim=<val>     Specifies the maximum number that resizable dimensions can grow into. Used as a hint to create UserBuffers for models with dynamic sized outputs. Should be a positive integer and is not applicable when using ITensor.
  --userbuffer_glbuffer     [EXPERIMENTAL] Specifies to use userbuffer for inference, and the input source is an OpenGL buffer. Cannot be combined with --encoding_type. GL buffer mode is only supported on Android OS.
  --data_type_map=<val>     Sets data type of IO buffers during prepare. Arguments should be provided in the following format:
                            --data_type_map buffer_name1=buffer_name1_data_type --data_type_map buffer_name2=buffer_name2_data_type
                            Data type can have the following values: float32, fixedPoint8, fixedPoint16.
  --tensor_mode=<val>       Sets type of tensor to use. Arguments should be provided in the following format: --tensor_mode itensor
                            Tensor mode can have the following values: userBuffer, itensor.
  --perf_profile=<val>      Specifies perf profile to set. Valid settings are "low_balanced", "balanced", "default", "high_performance", "sustained_high_performance", "burst", "low_power_saver", "power_saver", "high_power_saver" and "system_settings".
  --profiling_level=<val>   Specifies the profiling level. Valid settings are "off", "basic", "moderate" and "detailed". Default is detailed.
  --enable_cpu_fallback     Enables cpu fallback functionality. Defaults to disable mode.
  --input_name=<val>        Specifies the name of the input for which dimensions are specified.
  --input_dimensions=<val>  Specifies new dimensions for the input whose name is specified in input_name, e.g. "1,224,224,3". For multiple inputs, specify --input_name and --input_dimensions multiple times.
  --gpu_mode=<val>          Specifies gpu operation mode. Valid settings are "default", "float16". default = float32 math and float16 storage (equiv. use_gpu arg). float16 = float16 math and float16 storage.
  --enable_init_cache       Enable init caching mode to accelerate the network building process. Defaults to disable.
  --platform_options=<val>  Specifies value to pass as platform options.
  --priority_hint=<val>     Specifies hint for priority level. Valid settings are "low", "normal", "normal_high", "high". Defaults to normal. Note: "normal_high" is only available on DSP.
  --inferences_per_duration=<val> Specifies the number of inferences in a specific duration (in seconds), e.g. "10,20".
  --runtime_order=<val>     Specifies the order of precedence for runtime, e.g. cpu_float32, dsp_fixed8_tf, etc. Valid values are:
                            cpu_float32 (Snapdragon CPU)       = Data & Math: float 32bit
                            gpu_float32_16_hybrid (Adreno GPU) = Data: float 16bit  Math: float 32bit
                            dsp_fixed8_tf (Hexagon DSP)        = Data & Math: 8bit fixed point Tensorflow style format
                            gpu_float16 (Adreno GPU)           = Data: float 16bit  Math: float 16bit
                            aip_fixed8_tf (Snapdragon HTA+HVX) = Data & Math: 8bit fixed point Tensorflow style format
                            cpu (Snapdragon CPU)               = Same as cpu_float32
                            gpu (Adreno GPU)                   = Same as gpu_float32_16_hybrid
                            dsp (Hexagon DSP)                  = Same as dsp_fixed8_tf
                            aip (Snapdragon HTA+HVX)           = Same as aip_fixed8_tf
  --set_output_tensors=<val> Specifies a comma separated list of tensors to be output after execution.
  --set_unconsumed_as_output Sets all unconsumed tensors as outputs.
  --udo_package_path=<val>  Path to the registration library for UDO package(s). Optionally, the user can provide multiple packages as a comma-separated list.
  --duration=<val>          Specifies the duration of the run in seconds. Loops over the input_list until this amount of time has transpired.
  --dbglogs
  --timeout=<val>           Execution is terminated when the time limit is exceeded. Only valid for the dsp runtime currently.
  --userlogs=<val>          Specifies the user level logging as level,<optional logPath>.
  --help                    Show this help message.
  --version                 Show SNPE version number.
This binary writes raw output tensors into the output folder by default. Examples of using snpe-net-run can be found in the Running AlexNet tutorial.
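For instance, a minimal run on the default CPU runtime, and the same run dispatched to the DSP, might look like the following sketch (the container and input list names are illustrative):

snpe-net-run --container model.dlc --input_list input_list.txt
snpe-net-run --container model.dlc --input_list input_list.txt --use_dsp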
Additional details:
snpe-net-run is able to automatically batch the input data. The batch size is indicated in the model container (DLC file) but can also be set using the "input_dimensions" argument passed to snpe-net-run. Users do not need to batch their input data; if the input data is not batched, the input size needs to be a multiple of the size of the input data files. snpe-net-run groups the provided inputs into batches and pads any incomplete batch (if present) with zeros.
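As a sketch, the batch dimension of an input named "data" (a hypothetical name) could be overridden at run time so that snpe-net-run consumes the listed files three at a time:

snpe-net-run --container model.dlc --input_list input_list.txt --input_name=data --input_dimensions=3,224,224,3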
In the example below, the model is set to accept batches of three inputs. So, the inputs are automatically grouped together to form batches by snpe-net-run and padding is done to the final batch. Note that there are five output files generated by snpe-net-run:
…
Processing DNN input(s):
cropped/notice_sign.raw cropped/trash_bin.raw cropped/plastic_cup.raw
Processing DNN input(s):
cropped/handicap_sign.raw cropped/chairs.raw
Applying padding
snpe-net-run can take multiple input files as input data per iteration, and can specify multiple output names, in an input list file formatted as below:
#<output_name>[<space><output_name>]
<input_layer_name>:=<input_layer_path>[<space><input_layer_name>:=<input_layer_path>]
…
The first line, starting with a "#", specifies the output layers' names. If there is more than one output, a whitespace should be used as a delimiter. Following the first line, you can use multiple lines to supply input files, one line per iteration, where each <input_layer_name>:=<input_layer_path> pair supplies one layer. If there is more than one input per line, a whitespace should be used as a delimiter.
Here is an example, where the layer names are "Input_1" and "Input_2", and inputs are located in the path "Placeholder_1/real_input_inputs_1/". Its input list file should look like this:
#Output_1 Output_2
Input_1:=Placeholder_1/real_input_inputs_1/0-0#e6fb51.rawtensor Input_2:=Placeholder_1/real_input_inputs_1/0-1#8a171b.rawtensor
Input_1:=Placeholder_1/real_input_inputs_1/1-0#67c965.rawtensor Input_2:=Placeholder_1/real_input_inputs_1/1-1#54f1ff.rawtensor
Input_1:=Placeholder_1/real_input_inputs_1/2-0#b42dc6.rawtensor Input_2:=Placeholder_1/real_input_inputs_1/2-1#346a0e.rawtensor
Note: If the batch dimension of the model is greater than 1, the number of batch elements in the input file has to either match the batch dimension specified in the DLC or it has to be one. In the latter case, snpe-net-run will combine multiple lines into a single input tensor.
snpe-parallel-run loads a DLC file, loads the data for the input tensor(s), and executes the network on the specified runtime. This app is similar to snpe-net-run, but is able to run multiple threads of inference on the same network for benchmarking purposes.
DESCRIPTION:
------------
Example application demonstrating how to use SNPE with the PSNPE and SNPE C/C++ API.

REQUIRED ARGUMENTS:
-------------------
  --container <FILE>        Path to the DL container containing the network.
  --input_list <FILE>       Path to a file listing the inputs for the network.
  --perf_profile <VAL>      Specifies perf profile to set. Valid settings are "balanced", "default", "high_performance", "sustained_high_performance", "burst", "power_saver" and "system_settings". NOTE: "balanced" and "default" are the same; "default" is being deprecated in the future.
  --cpu_fallback <VAL>      Enables cpu fallback functionality. Valid settings are "false", "true".
  --runtime_order <VAL,VAL,VAL,...> Specifies the order of precedence for runtime, e.g. cpu,gpu. Valid values are:
                            cpu_float32 (Snapdragon CPU)       = Data & Math: float 32bit
                            gpu_float32_16_hybrid (Adreno GPU) = Data: float 16bit  Math: float 32bit
                            dsp_fixed8_tf (Hexagon DSP)        = Data & Math: 8bit fixed point Tensorflow style format
                            gpu_float16 (Adreno GPU)           = Data: float 16bit  Math: float 16bit
                            aip_fixed8_tf (Snapdragon HTA+HVX) = Data & Math: 8bit fixed point Tensorflow style format
                            cpu (Snapdragon CPU)               = Same as cpu_float32
                            gpu (Adreno GPU)                   = Same as gpu_float32_16_hybrid
                            dsp (Hexagon DSP)                  = Same as dsp_fixed8_tf
                            aip (Snapdragon HTA+HVX)           = Same as aip_fixed8_tf
  --use_cpu                 Use the CPU runtime for SNPE.
  --use_gpu                 Use the GPU float32 runtime for SNPE.
  --use_gpu_fp16            Use the GPU float16 runtime for SNPE.
  --use_dsp                 Use the DSP fixed point runtime for SNPE.
  --use_aip                 Use the AIP fixed point runtime for SNPE.

OPTIONAL ARGUMENTS:
-------------------
  --userbuffer_float        Specifies to use userbuffer for inference, and the input type is float.
  --userbuffer_tf8          Specifies to use userbuffer for inference, and the input type is tf8exact0.
  --userbuffer_auto         Specifies to use userbuffer with automatic input and output type detection for inference.
  --use_native_input_files  Specifies to consume the input file(s) in their native data type(s). Must be used with --userbuffer_xxx.
  --use_native_output_files Specifies to write the output file(s) in their native data type(s). Must be used with --userbuffer_xxx.
  --input_name <INPUT_NAME> Specifies the name of the input for which dimensions are specified.
  --input_dimensions <INPUT_DIM> Specifies new dimensions for the input whose name is specified in input_name, e.g. "1,224,224,3".
  --output_dir <DIR>        The directory to save result files.
  --static_min_max          Specifies to use quantization parameters from the model instead of input specific quantization. Used in conjunction with --userbuffer_tf8.
  --userbuffer_float_output Overrides the userbuffer output used for inference, and the output type is float. Must be used with user specified buffer.
  --userbuffer_tf8_output   Overrides the userbuffer output used for inference, and the output type is tf8exact0. Must be used with user specified buffer.
  --enable_init_cache       Enable init caching mode to accelerate the network building process. Defaults to disable.
  --profiling_level         Specifies the profiling level. Valid settings are "off", "basic", "moderate" and "detailed". Default is off.
  --platform_options        Specifies value to pass as platform options. Valid settings: "HtaDLBC:ON/OFF", "unsignedPD:ON/OFF".
  --set_output_tensors      Specifies a comma separated list of tensors to be output after execution.
  --userlogs <VAL>          Specifies the user level logging as level,<optional logPath>.
  --version                 Show SNPE version number.
  --help                    Show this help message.
Additional details:
For the required arguments pertaining to runtime specification, either --runtime_order or one of the --use_cpu/--use_gpu/etc. flags needs to be specified. The following example demonstrates an equivalent command using either of these options.
snpe-parallel-run --container container.dlc --input_list input_list.txt --perf_profile burst --cpu_fallback true --use_dsp --use_gpu --userbuffer_auto
is equivalent to
snpe-parallel-run --container container.dlc --input_list input_list.txt --perf_profile burst --cpu_fallback true --runtime_order dsp,gpu --userbuffer_auto
Spawning multiple threads:
snpe-parallel-run is able to create multiple threads to execute identical inference passes.
In the example below, the command supplies the required container and input list arguments. After these two options, the remaining options form a repeating sequence that corresponds to each thread. In this example, the runtime specified for each thread is varied (one for dsp, another for gpu, and the last one for dsp).
snpe-parallel-run --container container.dlc --input_list input_list.txt --perf_profile burst --cpu_fallback true --use_dsp --userbuffer_auto --perf_profile burst --cpu_fallback true --use_gpu --userbuffer_auto --perf_profile burst --cpu_fallback true --use_dsp --userbuffer_auto
When this command is executed, the following section of output is observed:
...
Processing DNN input(s):
input.raw
PSNPE start executing...
runtimes: dsp_fixed8_tf gpu_float32_16_hybrid dsp_fixed8_tf
- Mode :0- Number of images processed: x
Build time: x seconds.
...
Note that the number of runtimes listed corresponds to the number of threads specified, as well as the order in which those threads were specified.
The python script snpe_bench.py runs a DLC neural network and collects benchmark performance information.
usage: snpe_bench.py [-h] -c CONFIG_FILE [-o OUTPUT_BASE_DIR_OVERRIDE]
                     [-v DEVICE_ID_OVERRIDE] [-r HOST_NAME] [-a]
                     [-t DEVICE_OS_TYPE_OVERRIDE] [-d] [-s SLEEP]
                     [-b USERBUFFER_MODE] [-p PERFPROFILE] [-l PROFILINGLEVEL]
                     [-json] [-cache]

Run the snpe_bench

required arguments:
  -c CONFIG_FILE, --config_file CONFIG_FILE
                        Path to a valid config file. Refer to the sample config file config_help.json for more detail on how to fill params in the config file.

optional arguments:
  -o OUTPUT_BASE_DIR_OVERRIDE, --output_base_dir_override OUTPUT_BASE_DIR_OVERRIDE
                        Sets the output base directory.
  -v DEVICE_ID_OVERRIDE, --device_id_override DEVICE_ID_OVERRIDE
                        Use this device ID instead of the one supplied in the config file. Cannot be used with -a.
  -r HOST_NAME, --host_name HOST_NAME
                        Hostname/IP of the remote machine to which devices are connected.
  -a, --run_on_all_connected_devices_override
                        Runs on all connected devices; currently only supports 1. Cannot be used with -v.
  -t DEVICE_OS_TYPE_OVERRIDE, --device_os_type_override DEVICE_OS_TYPE_OVERRIDE
                        Specify the target OS type. Valid options are ['android', 'android-aarch64', 'le', 'le64_gcc4.9', 'le_oe_gcc6.4', 'le64_oe_gcc6.4'].
  -d, --debug           Set to turn on debug log.
  -s SLEEP, --sleep SLEEP
                        Set the number of seconds to sleep between runs, e.g. 20 seconds.
  -b USERBUFFER_MODE, --userbuffer_mode USERBUFFER_MODE
                        [EXPERIMENTAL] Enable user buffer mode; defaults to float, can be tf8exact0.
  -p PERFPROFILE, --perfprofile PERFPROFILE
                        Set the benchmark operating mode (balanced, default, sustained_high_performance, high_performance, power_saver, system_settings).
  -l PROFILINGLEVEL, --profilinglevel PROFILINGLEVEL
                        Set the profiling level mode (off, basic, moderate, detailed). Default is basic.
  -json, --generate_json
                        Set to produce json output.
  -cache, --enable_init_cache
                        Enable init caching mode to accelerate the network building process. Defaults to disable.
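An invocation might look like the following (the config file name is illustrative; see config_help.json for the expected schema):

python snpe_bench.py -c alexnet_config.json -p high_performance -l basic -json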
snpe-caffe-to-dlc converts a Caffe model into an SNPE DLC file.
usage: snpe-caffe-to-dlc [-h] [--input_network INPUT_NETWORK] [-o OUTPUT_PATH]
                         [--out_node OUT_NAMES] [--copyright_file COPYRIGHT_FILE]
                         [--model_version MODEL_VERSION] [--disable_batchnorm_folding]
                         [--input_type INPUT_NAME INPUT_TYPE] [--input_dtype INPUT_NAME INPUT_DTYPE]
                         [--input_encoding INPUT_NAME INPUT_ENCODING] [--input_layout INPUT_NAME INPUT_LAYOUT]
                         [--udl UDL_MODULE FACTORY_FUNCTION] [--enable_preprocessing]
                         [--quantization_overrides QUANTIZATION_OVERRIDES] [--keep_quant_nodes]
                         [--keep_disconnected_nodes] [--validation_target RUNTIME_TARGET PROCESSOR_TARGET]
                         [--strict] [--debug [DEBUG]] [-b CAFFE_BIN]
                         [--udo_config_paths CUSTOM_OP_CONFIG_PATHS [CUSTOM_OP_CONFIG_PATHS ...]]

Script to convert a caffemodel into a DLC file.

required arguments:
  --input_network INPUT_NETWORK, -i INPUT_NETWORK
                        Path to the source framework model.

optional arguments:
  -h, --help            show this help message and exit
  --out_node OUT_NAMES, --out_name OUT_NAMES
                        Names of the graph's output tensors. Multiple output names should be provided separately, like: --out_name out_1 --out_name out_2
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path where the converted output model should be saved. If not specified, the converted model will be written to a file with the same name as the input model.
  --copyright_file COPYRIGHT_FILE
                        Path to copyright file. If provided, the content of the file will be added to the output model.
  --model_version MODEL_VERSION
                        User-defined ASCII string to identify the model; only the first 64 bytes will be stored.
  --disable_batchnorm_folding
                        If not specified, the converter will try to fold batchnorm into the previous convolution layer.
  --input_type INPUT_NAME INPUT_TYPE, -t INPUT_NAME INPUT_TYPE
                        Type of data expected by each input op/layer. Type for each input is |default| if not specified. For example: "data" image. Note that the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify multiple --input_type on the command line, e.g.: --input_type "data1" image --input_type "data2" opaque
                        These options get used by the DSP runtime, and the following descriptions state how the input will be handled for each option.
                        Image: input is float between 0-255, the input's mean is 0.0f and the input's max is 255.0f. The float will be cast to uint8ts and the uint8ts passed to the DSP.
                        Default: pass the input as floats to the dsp directly and the DSP will quantize it.
                        Opaque: assumes the input is float because the consumer layer (i.e. the next layer) requires it as float, therefore it won't be quantized.
                        Choices supported: ['image', 'default', 'opaque']
  --input_dtype INPUT_NAME INPUT_DTYPE
                        The names and datatype of the network input layers specified in the format [input_name datatype], for example: 'data' 'float32'. Default is float32 if not specified. Note that the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify multiple --input_dtype on the command line like: --input_dtype 'data1' 'float32' --input_dtype 'data2' 'float32'
  --input_encoding INPUT_NAME INPUT_ENCODING, -e INPUT_NAME INPUT_ENCODING
                        Image encoding of the source images. Default is bgr. Example usage: "data" rgba. Note the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify --input_encoding for each on the command line, e.g.: --input_encoding "data1" rgba --input_encoding "data2" other
                        Use options: color encodings (bgr, rgb, nv21, ...) if the input is an image; time_series for inputs of rnn models; other if the input doesn't follow the above categories or is unknown.
                        Choices supported: ['bgr', 'rgb', 'rgba', 'argb32', 'nv21', 'time_series', 'other']
  --input_layout INPUT_NAME INPUT_LAYOUT, -l INPUT_NAME INPUT_LAYOUT
                        Layout of each input tensor. If not specified, it will use the default based on the source framework, the shape of the input, and the input encoding. Accepted values are: NCDHW, NDHWC, NCHW, NHWC, NFC, NCF, NTF, TNF, NF, NC, F, NONTRIVIAL
                        N = Batch, C = Channels, D = Depth, H = Height, W = Width, F = Feature, T = Time
                        NDHWC/NCDHW used for 5d inputs
                        NHWC/NCHW used for 4d image-like inputs
                        NFC/NCF used for inputs to Conv1D or other 1D ops
                        NTF/TNF used for inputs with time steps like the ones used for LSTM op
                        NF used for 2D inputs, like the inputs to Dense/FullyConnected layers
                        NC used for 2D inputs with 1 for batch and the other for Channels (rarely used)
                        F used for 1D inputs, e.g. Bias tensor
                        NONTRIVIAL for everything else
                        For multiple inputs specify multiple --input_layout on the command line, e.g.: --input_layout "data1" NCHW --input_layout "data2" NCHW
  --udl UDL_MODULE FACTORY_FUNCTION
                        Option to add User Defined Layers. Provide the filename and function name.
                        1. Filename: Name of the python module to load for registering custom udl (note: must be in PYTHONPATH). If the file is part of a package, list it as package.filename as you would when doing a python import.
                        2. Function name: Name of the udl factory function that returns a dictionary with the layer type as key and a function callback as value.
  --enable_preprocessing
                        If specified, the converter will enable preprocessing specified by a data layer transform_param; subtract_mean is supported.
  --keep_disconnected_nodes
                        Disable the optimization that removes ops not connected to the main graph. This optimization uses output names provided over the command line OR inputs/outputs extracted from the source model to determine the main graph.
  --validation_target RUNTIME_TARGET PROCESSOR_TARGET
                        A combination of processor and runtime target against which the model will be validated. Choices for RUNTIME_TARGET: {cpu, gpu, dsp}. Choices for PROCESSOR_TARGET: {snapdragon_801, snapdragon_820, snapdragon_835}. If not specified, the model will be validated against {snapdragon_820, snapdragon_835} across all runtime targets.
  --strict              If specified, validates in strict mode, whereby the model will not be produced if it violates constraints of the specified validation target. If not specified, validates the model in permissive mode against the specified validation target.
  --debug [DEBUG]       Run the converter in debug mode.
  -b CAFFE_BIN, --caffe_bin CAFFE_BIN
                        Input caffe binary file containing the weight data.
  --udo_config_paths CUSTOM_OP_CONFIG_PATHS [CUSTOM_OP_CONFIG_PATHS ...], -udo CUSTOM_OP_CONFIG_PATHS [CUSTOM_OP_CONFIG_PATHS ...]
                        Path to the UDO configs (space separated, if multiple).

Quantizer Options:
  --quantization_overrides QUANTIZATION_OVERRIDES
                        Use this option to specify a json file with parameters to use for quantization. These will override any quantization data carried from conversion (e.g. TF fake quantization) or calculated during the normal quantization process. Format defined as per AIMET specification.
  --keep_quant_nodes    Use this option to keep activation quantization nodes in the graph rather than stripping them.
Examples of using this script can be found in Converting Models from Caffe to SNPE.
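As a sketch, a typical conversion passes the network definition through --input_network and the weights through --caffe_bin (the file names here are placeholders):

snpe-caffe-to-dlc --input_network deploy.prototxt --caffe_bin model.caffemodel --output_path model.dlc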
snpe-diagview loads a DiagLog file generated by snpe-net-run whenever it operates on input tensor data. The DiagLog file contains timing information for each layer as well as for the entire forward propagation. If the run uses an input list of input tensors, the timing info reported by snpe-diagview is an average over the entire input set.
snpe-net-run generates files named "SNPEDiag_0.log", "SNPEDiag_1.log", ..., "SNPEDiag_n.log", where n corresponds to the nth iteration of the snpe-net-run execution.
usage: snpe-diagview --input_log DIAG_LOG [-h] [--output CSV_FILE]

Reads a diagnostic log and outputs the contents to stdout

required arguments:
  --input_log DIAG_LOG  Diagnostic log file (required)

optional arguments:
  --output CSV_FILE     Output CSV file with all diagnostic data (optional)
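For example, to summarize the first iteration's log from a previous snpe-net-run invocation and also export it as CSV (paths illustrative):

snpe-diagview --input_log output/SNPEDiag_0.log --output diag_report.csv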
snpe-dlc-info outputs layer information from a DLC file, which provides information about the network model.
usage: snpe-dlc-info [-h] -i INPUT_DLC [-s SAVE]

required arguments:
  -i INPUT_DLC, --input_dlc INPUT_DLC
                        Path to a DLC file

optional arguments:
  -s SAVE, --save SAVE  Save the output to a csv file. Specify a target file path.
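For example (file names illustrative):

snpe-dlc-info -i model.dlc -s model_info.csv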
snpe-dlc-diff compares two DLCs and by default outputs some of the following differences between them in tabular format:
usage: snpe-dlc-diff [-h] -i1 INPUT_DLC_ONE -i2 INPUT_DLC_TWO [-c] [-l] [-p] [-d] [-w] [-o] [-i] [-x] [-s SAVE]

required arguments:
  -i1 INPUT_DLC_ONE, --input_dlc_one INPUT_DLC_ONE
                        path to the first dl container archive
  -i2 INPUT_DLC_TWO, --input_dlc_two INPUT_DLC_TWO
                        path to the second dl container archive

optional arguments:
  -h, --help            show this help message and exit
  -c, --copyrights      compare copyrights between models
  -l, --layers          compare unique layers between models
  -p, --parameters      compare parameter differences between identically named layers
  -d, --dimensions      compare dimension differences between identically named layers
  -w, --weights         compare weight differences between identically named layers
  -o, --outputs         compare output tensor name differences between identically named layers
  -i, --diff_by_id      Overrides the default comparison strategy for diffing the two models' components. By default, comparison is made between identically named layers. With this option the models are ordered by id and the diff is done in order, as long as no more than 1 consecutive layer has a different layer type.
  -x, --hta             compare HTA record differences between models
  -s SAVE, --save SAVE  Save the output to a csv file. Specify a target file path.
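For example, to compare parameters and dimensions between a float model and its quantized counterpart and save the result (file names illustrative):

snpe-dlc-diff -i1 model.dlc -i2 model_quantized.dlc -p -d -s diff.csv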
snpe-dlc-viewer visualizes the network structure of a DLC in a web browser.
usage: snpe-dlc-viewer [-h] -i INPUT_DLC [-s]

required arguments:
  -i INPUT_DLC, --input_dlc INPUT_DLC
                        Path to a DLC file

optional arguments:
  -s, --save            Save HTML file. Specify a file name and/or target save path
  -h, --help            Shows this help message and exits
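For example, to render a model and also keep the generated HTML (file names illustrative):

snpe-dlc-viewer -i model.dlc -s model.html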
Additional details:
The DLC viewer tool renders the specified network DLC in HTML format, which may be viewed in a web browser.
On installations that support a native web browser, a browser instance is opened in which the network is automatically rendered.
Users can optionally save the HTML content anywhere on their systems and open it in a web browser of their choice at a later time.
snpe-dlc-quantize converts non-quantized DLC models into quantized DLC models.
Command Line Options:
  [ -h, --help ]              Displays this help message.
  [ --version ]               Displays version information.
  [ --verbose ]               Enables verbose user messages.
  [ --quiet ]                 Disables some user messages.
  [ --silent ]                Disables all but fatal user messages.
  [ --debug=<val> ]           Sets the debug log level.
  [ --debug1 ]                Enables level 1 debug messages.
  [ --debug2 ]                Enables level 2 debug messages.
  [ --debug3 ]                Enables level 3 debug messages.
  [ --log-mask=<val> ]        Sets the debug log mask to set the log level for one or more areas. Example: ".*=USER_ERROR, .*=INFO, NDK=DEBUG2, NCC=DEBUG3"
  [ --log-file=<val> ]        Overrides the default name for the debug log file.
  [ --log-dir=<val> ]         Overrides the default directory path where debug log files are written.
  [ --log-file-include-hostname ] Appends the name of this host to the log file name.
  [ --input_dlc=<val> ]       Path to the dlc container containing the model for which fixed-point encoding metadata should be generated. This argument is required.
  [ --input_list=<val> ]      Path to a file specifying the trial inputs. This file should be a plain text file, containing one or more absolute file paths per line. These files will be taken to constitute the trial set. Each path is expected to point to a binary file containing one trial input in the 'raw' format, ready to be consumed by SNPE without any further modifications. This is similar to how input is provided to the snpe-net-run application.
  [ --no_weight_quantization ] Generate and add the fixed-point encoding metadata but keep the weights in floating point. This argument is optional.
  [ --output_dlc=<val> ]      Path at which the metadata-included quantized model container should be written. If this argument is omitted, the quantized model will be written at <unquantized_model_name>_quantized.dlc.
  [ --enable_htp ]            Pack HTP information in the quantized DLC.
  [ --htp_socs=<val> ]        Specify the SoC to generate the HTP Offline Cache for. SoCs are specified with an ASIC identifier, in a comma separated list. For example, --htp_socs sm8550
  [ --overwrite_cache_records ] Overwrite HTP cache records present in the DLC.
  [ --use_float_io ]          Pack HTP information in the quantized DLC (Note: deprecated).
  [ --use_enhanced_quantizer ] Use the enhanced quantizer feature when quantizing the model. Regular quantization determines the range using the actual values of min and max of the data being quantized. Enhanced quantization uses an algorithm to determine the optimal range. It can be useful for quantizing models that have long tails in the distribution of the data being quantized.
  [ --use_adjusted_weights_quantizer ] Use the adjusted tf quantizer for quantizing the weights only. This might be helpful for improving the accuracy of some models, such as the denoise model as tested. This option is only used when quantizing the weights with 8 bit.
  [ --optimizations ]         Use this option to enable new optimization algorithms. Usage is: --optimizations <algo_name1> --optimizations <algo_name2>
                              The available optimization algorithms are:
                              cle - Cross layer equalization includes a number of methods for equalizing weights and biases across layers in order to rectify imbalances that cause quantization errors.
                              bc - Bias correction adjusts biases to offset activation quantization errors. Typically used in conjunction with 'cle' to improve quantization accuracy (Note: deprecated).
  [ --override_params ]       Use this option to override quantization parameters when quantization was provided from the original source framework (e.g. TF fake quantization).
  [ --use_encoding_optimizations ] Use this option to enable quantization encoding optimizations. This can reduce requantization in the graph and may improve accuracy for some models (Note: this flag can be passed in, but is a no-op. Recognition of this flag will be removed in the future).
  [ --use_symmetric_quantize_weights ] Use the symmetric quantizer feature when quantizing the weights of the model. It makes sure min and max have the same absolute values about zero. Symmetrically quantized data will also be stored as int#_t data such that the offset is always 0.
  [ --bias_bitwidth=<val> ]   Use the --bias_bitwidth option to select the bitwidth to use when quantizing the biases, either 8 (default) or 32. Using 32 bit biases may sometimes provide a small improvement in accuracy. Can't mix with --bitwidth.
  [ --act_bitwidth=<val> ]    Use the --act_bitwidth option to select the bitwidth to use when quantizing the activations, either 8 (default) or 16. 8w/16a is only supported by the HTA currently. Can't mix with --bitwidth.
  [ --weights_bitwidth=<val> ] Use the --weights_bitwidth option to select the bitwidth to use when quantizing the weights, either 8 (default) or 16. 8w/16a is only supported by the HTA currently. Can't mix with --bitwidth.
  [ --bitwidth=<val> ]        Use the --bitwidth option to select the bitwidth to use when quantizing the weights/activations/biases, either 8 (default) or 16. Can't mix with --weights_bitwidth, --act_bitwidth or --bias_bitwidth.
  [ --udo_package_path=<val> ] Use this option to specify the path to the Registration Library for a UDO Package. Usage is: --udo_package_path=<path_to_reg_lib> This option must be specified for networks with UDO. All UDOs in the network must have a host-executable CPU implementation.

Description:
Generate 8 or 16 bit TensorFlow style fixed point weight and activation encodings for a floating point SNPE model.
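As an illustrative sketch, quantizing a model against a calibration input list and packing an HTP cache for a single SoC might look like this (file names are placeholders):

snpe-dlc-quantize --input_dlc model.dlc --input_list calibration_list.txt --output_dlc model_quantized.dlc --enable_htp --htp_socs sm8550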
Additional details:
#<output_layer_name>[<space><output_layer_name>]
%<output_tensor_name>[<space><output_tensor_name>]
<input_layer_name>:=<input_layer_path>[<space><input_layer_name>:=<input_layer_path>]
…
Note: Output tensors and layers can be specified individually, but when specifying both, the order shown above must be used.
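For example, an input list that selects one output layer, one output tensor, and one input file per iteration might look like the following (all names are illustrative):

#output_layer_1
%output_tensor_1
data:=inputs/input_0.raw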
snpe-dlc-quant converts non-quantized DLC models into quantized DLC models.
Command Line Options:
  [ -h, --help ]              Displays this help message.
  [ --version ]               Displays version information.
  [ --verbose ]               Enables verbose user messages.
  [ --quiet ]                 Disables some user messages.
  [ --silent ]                Disables all but fatal user messages.
  [ --debug=<val> ]           Sets the debug log level.
  [ --debug1 ]                Enables level 1 debug messages.
  [ --debug2 ]                Enables level 2 debug messages.
  [ --debug3 ]                Enables level 3 debug messages.
  [ --log-mask=<val> ]        Sets the debug log mask to set the log level for one or more areas. Example: ".*=USER_ERROR, .*=INFO, NDK=DEBUG2, NCC=DEBUG3"
  [ --log-file=<val> ]        Overrides the default name for the debug log file.
  [ --log-dir=<val> ]         Overrides the default directory path where debug log files are written.
  [ --log-file-include-hostname ] Appends the name of this host to the log file name.
  [ --input_dlc=<val> ]       Path to the dlc container containing the model for which fixed-point encoding metadata should be generated. This argument is required.
  [ --input_list=<val> ]      Path to a file specifying the trial inputs. This file should be a plain text file, containing one or more absolute file paths per line. These files will be taken to constitute the trial set. Each path is expected to point to a binary file containing one trial input in the 'raw' format, ready to be consumed by SNPE without any further modifications. This is similar to how input is provided to the snpe-net-run application.
  [ --no_weight_quantization ] Generate and add the fixed-point encoding metadata but keep the weights in floating point. This argument is optional.
  [ --output_dlc=<val> ]      Path at which the metadata-included quantized model container should be written. If this argument is omitted, the quantized model will be written at <unquantized_model_name>_quantized.dlc.
  [ --use_enhanced_quantizer ] Use the enhanced quantizer feature when quantizing the model. Regular quantization determines the range using the actual values of min and max of the data being quantized. Enhanced quantization uses an algorithm to determine the optimal range. It can be useful for quantizing models that have long tails in the distribution of the data being quantized.
  [ --use_adjusted_weights_quantizer ] Use the adjusted tf quantizer for quantizing the weights only. This might be helpful for improving the accuracy of some models, such as the denoise model as tested. This option is only used when quantizing the weights with 8 bit.
  [ --optimizations ]         Use this option to enable new optimization algorithms. Usage is: --optimizations <algo_name1> --optimizations <algo_name2>
                              The available optimization algorithms are:
                              cle - Cross layer equalization includes a number of methods for equalizing weights and biases across layers in order to rectify imbalances that cause quantization errors.
  [ --override_params ]       Use this option to override quantization parameters when quantization was provided from the original source framework (e.g. TF fake quantization).
  [ --use_encoding_optimizations ] Use this option to enable quantization encoding optimizations. This can reduce requantization in the graph and may improve accuracy for some models (Note: deprecated).
  [ --use_symmetric_quantize_weights ] Use the symmetric quantizer feature when quantizing the weights of the model. It makes sure min and max have the same absolute values about zero. Symmetrically quantized data will also be stored as int#_t data such that the offset is always 0.
  [ --bias_bitwidth=<val> ]   Use the --bias_bitwidth option to select the bitwidth to use when quantizing the biases, either 8 (default) or 32. Using 32 bit biases may sometimes provide a small improvement in accuracy. Can't mix with --bitwidth.
  [ --act_bitwidth=<val> ]    Use the --act_bitwidth option to select the bitwidth to use when quantizing the activations, either 8 (default) or 16. 8w/16a is only supported by the HTA currently. Can't mix with --bitwidth.
  [ --weights_bitwidth=<val> ] Use the --weights_bitwidth option to select the bitwidth to use when quantizing the weights, either 8 (default) or 16. 8w/16a is only supported by the HTA currently. Can't mix with --bitwidth.
  [ --bitwidth=<val> ]        Use the --bitwidth option to select the bitwidth to use when quantizing the weights/activations/biases, either 8 (default) or 16. Can't mix with --weights_bitwidth, --act_bitwidth or --bias_bitwidth.
  [ --udo_package_path=<val> ] Use this option to specify the path to the Registration Library for a UDO Package. Usage is: --udo_package_path=<path_to_reg_lib> This option must be specified for networks with UDO. All UDOs in the network must have a host-executable CPU implementation.

Description:
Generate 8 or 16 bit TensorFlow style fixed point weight and activation encodings for a floating point SNPE model.
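A minimal sketch (file names are placeholders):

snpe-dlc-quant --input_dlc model.dlc --input_list calibration_list.txt --output_dlc model_quantized.dlc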
Additional details:
#<output_layer_name>[<space><output_layer_name>]
%<output_tensor_name>[<space><output_tensor_name>]
<input_layer_name>:=<input_layer_path>[<space><input_layer_name>:=<input_layer_path>]
…
Note: Output tensors and layers can be specified individually, but when specifying both, the order shown above must be used.
snpe-dlc-graph-prepare is used to perform offline graph preparation on quantized DLCs for running on the DSP/HTP runtimes.
Command Line Options:
  [ -h, --help ]              Displays this help message.
  [ --version ]               Displays version information.
  [ --verbose ]               Enables verbose user messages.
  [ --quiet ]                 Disables some user messages.
  [ --silent ]                Disables all but fatal user messages.
  [ --debug=<val> ]           Sets the debug log level.
  [ --debug1 ]                Enables level 1 debug messages.
  [ --debug2 ]                Enables level 2 debug messages.
  [ --debug3 ]                Enables level 3 debug messages.
  [ --log-mask=<val> ]        Sets the debug log mask to set the log level for one or more areas. Example: ".*=USER_ERROR, .*=INFO, NDK=DEBUG2, NCC=DEBUG3"
  [ --log-file=<val> ]        Overrides the default name for the debug log file.
  [ --log-dir=<val> ]         Overrides the default directory path where debug log files are written.
  [ --log-file-include-hostname ] Appends the name of this host to the log file name.
  [ --input_dlc ]             Path to the dlc container containing the model for which the graph cache should be generated. This argument is required.
  [ --output_dlc ]            Path at which the cached-data-included model container should be written. If this argument is omitted, the quantized model will be written at InputModelName_cached.dlc.
  [ --set_output_tensors ]    Specifies a comma separated list of tensors to be output after execution, without whitespace.
  [ --set_output_layers ]     Specifies a comma separated list of layers whose output buffers should be output after execution, without whitespace.
  [ --input_list ]            Path to a file specifying the trial inputs. This file should be a plain text file, containing one or more absolute file paths per line. These files will be taken to constitute the trial set. Each path is expected to point to a binary file containing one trial input in the 'raw' format, ready to be consumed by SNPE without any further modifications. This is similar to how input is provided to the snpe-net-run application.
  [ --htp_socs ]              Specify SoC(s) to generate the HTP Offline Cache for. SoCs are specified with an ASIC identifier, in a comma separated list without whitespace. For example, --htp_socs sm8350,sm8450,sm8550. This flag and --htp_archs are mutually exclusive.
  [ --htp_archs ]             Specify DSP architecture(s) to generate a general HTP Offline Cache for. Architectures are specified with an ASIC identifier, in a comma separated list without whitespace. For example, --htp_archs v68,v73. This flag cannot be coupled with --htp_socs or --vtcm_override.
  [ --vtcm_override ]         Specify a single value representing the VTCM size in MB for the generated HTP Offline Caches. For example, --vtcm_override 4. This flag can be used in conjunction with --htp_socs to override the default SoC vtcm size setting.
  [ --buffer_data_type ]      Sets data type of IO buffers during prepare. Data type can be one of the following: float32, fixedPoint8, fixedPoint16. Arguments should be formatted as follows: --buffer_data_type buffer_name1=buffer_name1_data_type --buffer_data_type buffer_name2=buffer_name2_data_type (Note: deprecated)
  [ --overwrite_cache_records ] Erase all HTP cache records present in the DLC before generating the requested caches.
  [ --use_float_io ]          Prepare the quantized HTP graph to operate with floating point inputs/outputs (Note: deprecated).
  [ --udo_package_path ]      Use this option to specify the path to the Registration Library for UDO Package(s). Usage is: --udo_package_path=<path_to_reg_lib> Optionally, the user can provide multiple packages as a comma-separated list. This option must be specified for networks with UDO. All UDOs in the network must have a host-executable CPU implementation.
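For example, preparing offline caches for two SoCs from an already quantized model might look like this (file names are placeholders):

snpe-dlc-graph-prepare --input_dlc model_quantized.dlc --output_dlc model_prepared.dlc --htp_socs sm8450,sm8550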
snpe-tensorflow-to-dlc converts a TensorFlow model into an SNPE DLC file.
usage: snpe-tensorflow-to-dlc -d INPUT_NAME INPUT_DIM --out_node OUT_NAMES
                              [--input_type INPUT_NAME INPUT_TYPE] [--input_dtype INPUT_NAME INPUT_DTYPE]
                              [--input_encoding INPUT_NAME INPUT_ENCODING] [--debug [DEBUG]]
                              [--input_layout INPUT_NAME INPUT_LAYOUT] [--show_unconsumed_nodes]
                              [--saved_model_tag SAVED_MODEL_TAG] [--saved_model_signature_key SAVED_MODEL_SIGNATURE_KEY]
                              [--disable_batchnorm_folding] [--quantization_overrides QUANTIZATION_OVERRIDES]
                              [--keep_quant_nodes] [--keep_disconnected_nodes] --input_network INPUT_NETWORK
                              [-h] [-o OUTPUT_PATH] [--copyright_file COPYRIGHT_FILE]
                              [--model_version MODEL_VERSION] [--validation_target RUNTIME_TARGET PROCESSOR_TARGET]
                              [--strict] [--udo_config_paths CUSTOM_OP_CONFIG_PATHS [CUSTOM_OP_CONFIG_PATHS ...]]

Script to convert a tensorflow model into a DLC file.

required arguments:
  -d INPUT_NAME INPUT_DIM, --input_dim INPUT_NAME INPUT_DIM
                        The names and dimensions of the network input layers specified in the format [input_name comma-separated-dimensions], for example: 'data' 1,224,224,3. Note that the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify multiple --input_dim on the command line like: --input_dim 'data1' 1,224,224,3 --input_dim 'data2' 1,50,100,3
  --out_node OUT_NODE   Name of the graph's output nodes. Multiple output nodes should be provided separately, like: --out_node out_1 --out_node out_2
  --input_network INPUT_NETWORK, -i INPUT_NETWORK
                        Path to the source framework model.

optional arguments:
  -h, --help            show this help message and exit
  --input_type INPUT_NAME INPUT_TYPE, -t INPUT_NAME INPUT_TYPE
                        Type of data expected by each input op/layer. Type for each input is |default| if not specified. For example: "data" image. Note that the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify multiple --input_type on the command line, e.g.: --input_type "data1" image --input_type "data2" opaque
                        These options get used by the DSP runtime, and the following descriptions state how the input will be handled for each option.
                        Image: input is float between 0-255, the input's mean is 0.0f and the input's max is 255.0f. The float will be cast to uint8ts and the uint8ts passed to the DSP.
                        Default: pass the input as floats to the dsp directly and the DSP will quantize it.
                        Opaque: assumes the input is float because the consumer layer (i.e. the next layer) requires it as float, therefore it won't be quantized.
                        Choices supported: image, default, opaque
  --input_dtype INPUT_NAME INPUT_DTYPE
                        The names and datatype of the network input layers specified in the format [input_name datatype], for example: 'data' 'float32'. Default is float32 if not specified. Note that the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify multiple --input_dtype on the command line like: --input_dtype 'data1' 'float32' --input_dtype 'data2' 'float32'
  --input_encoding INPUT_NAME INPUT_ENCODING, -e INPUT_NAME INPUT_ENCODING
                        Image encoding of the source images. Default is bgr. Example usage: "data" rgba. Note the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify --input_encoding for each on the command line, e.g.: --input_encoding "data1" rgba --input_encoding "data2" other
                        Use options: color encodings (bgr, rgb, nv21, ...) if the input is an image; time_series for inputs of rnn models; other if the input doesn't follow the above categories or is unknown.
                        Choices supported: bgr, rgb, rgba, argb32, nv21, time_series, other
  --debug [DEBUG]       Run the converter in debug mode.
  --input_layout INPUT_NAME INPUT_LAYOUT, -l INPUT_NAME INPUT_LAYOUT
                        Layout of each input tensor. If not specified, it will use the default based on the source framework, the shape of the input, and the input encoding. Accepted values are: NCDHW, NDHWC, NCHW, NHWC, NFC, NCF, NTF, TNF, NF, NC, F, NONTRIVIAL
                        N = Batch, C = Channels, D = Depth, H = Height, W = Width, F = Feature, T = Time
                        NDHWC/NCDHW used for 5d inputs
                        NHWC/NCHW used for 4d image-like inputs
                        NFC/NCF used for inputs to Conv1D or other 1D ops
                        NTF/TNF used for inputs with time steps like the ones used for LSTM op
                        NF used for 2D inputs, like the inputs to Dense/FullyConnected layers
                        NC used for 2D inputs with 1 for batch and the other for Channels (rarely used)
                        F used for 1D inputs, e.g. Bias tensor
                        NONTRIVIAL for everything else
                        For multiple inputs specify multiple --input_layout on the command line, e.g.: --input_layout "data1" NCHW --input_layout "data2" NCHW
  --udo_config_paths UDO_CONFIG_PATHS [UDO_CONFIG_PATHS ...], -udo UDO_CONFIG_PATHS [UDO_CONFIG_PATHS ...]
                        Path to the UDO configs (space separated, if multiple).
  --show_unconsumed_nodes
                        Displays a list of unconsumed nodes, if any are found. Nodes which are unconsumed do not violate the structural fidelity of the generated graph.
  --saved_model_tag SAVED_MODEL_TAG
                        Specify the tag to select a MetaGraph from the SavedModel, e.g. --saved_model_tag serve. Default value is 'serve' when it is not assigned.
  --saved_model_signature_key SAVED_MODEL_SIGNATURE_KEY
                        Specify the signature key to select the input and output of the model, e.g. --saved_model_signature_key serving_default. Default value is 'serving_default' when it is not assigned.
  --disable_batchnorm_folding
                        If not specified, the converter will try to fold batchnorm into the previous layer.
  --keep_disconnected_nodes
                        Disable the optimization that removes ops not connected to the main graph. This optimization uses output names provided over the command line OR inputs/outputs extracted from the source model to determine the main graph.
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path where the converted output model should be saved. If not specified, the converted model will be written to a file with the same name as the input model.
  --copyright_file COPYRIGHT_FILE
                        Path to copyright file. If provided, the content of the file will be added to the output model.
  --model_version MODEL_VERSION
                        User-defined ASCII string to identify the model; only the first 64 bytes will be stored.
  --validation_target RUNTIME_TARGET PROCESSOR_TARGET
                        A combination of processor and runtime target against which the model will be validated. Choices for RUNTIME_TARGET: {cpu, gpu, dsp}. Choices for PROCESSOR_TARGET: {snapdragon_801, snapdragon_820, snapdragon_835}. If not specified, the model will be validated against {snapdragon_820, snapdragon_835} across all runtime targets.
  --strict              If specified, validates in strict mode, whereby the model will not be produced if it violates constraints of the specified validation target. If not specified, validates the model in permissive mode against the specified validation target.

Quantizer Options:
  --quantization_overrides QUANTIZATION_OVERRIDES
                        Use this option to specify a json file with parameters to use for quantization. These will override any quantization data carried from conversion (e.g. TF fake quantization) or calculated during the normal quantization process. Format defined as per AIMET specification.
  --keep_quant_nodes    Use this option to keep activation quantization nodes in the graph rather than stripping them.
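As a sketch, converting a frozen graph might look like the following (the file, input, and node names are placeholders):

snpe-tensorflow-to-dlc --input_network frozen_graph.pb --input_dim 'input' 1,224,224,3 --out_node softmax --output_path model.dlc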
Examples of using this script can be found in Converting Models from TensorFlow to SNPE.
Additional details:
--out_node: the name of the last node in your TensorFlow graph, which will represent the output layer of your network.
snpe-tflite-to-dlc converts a TFLite model into an SNPE DLC file.
usage: snpe-tflite-to-dlc -d INPUT_NAME INPUT_DIM [--input_dtype INPUT_NAME INPUT_DTYPE]
                          [--out_node OUT_NODE] [--input_type INPUT_NAME INPUT_TYPE]
                          [--input_encoding INPUT_NAME INPUT_ENCODING] [--debug [DEBUG]]
                          [--input_layout INPUT_NAME INPUT_LAYOUT]
                          [--udo_config_paths UDO_CONFIG_PATHS [UDO_CONFIG_PATHS ...]]
                          [--dump_relay DUMP_RELAY] [--disable_batchnorm_folding]
                          [--quantization_overrides QUANTIZATION_OVERRIDES] [--keep_quant_nodes]
                          [--keep_disconnected_nodes] --input_network INPUT_NETWORK
                          [-h] [-o OUTPUT_PATH] [--copyright_file COPYRIGHT_FILE]
                          [--model_version MODEL_VERSION] [--validation_target RUNTIME_TARGET PROCESSOR_TARGET]
                          [--strict]

Script to convert a TFLite model into a DLC file.

required arguments:
  -d INPUT_NAME INPUT_DIM, --input_dim INPUT_NAME INPUT_DIM
                        The names and dimensions of the network input layers specified in the format [input_name comma-separated-dimensions], for example: 'data' 1,224,224,3. Note that the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify multiple --input_dim on the command line like: --input_dim 'data1' 1,224,224,3 --input_dim 'data2' 1,50,100,3
  --input_network INPUT_NETWORK, -i INPUT_NETWORK
                        Path to the source framework model.

optional arguments:
  -h, --help            show this help message and exit
  --input_dtype INPUT_NAME INPUT_DTYPE
                        The names and datatype of the network input layers specified in the format [input_name datatype], for example: 'data' 'float32'. Default is float32 if not specified. Note that the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify multiple --input_dtype on the command line like: --input_dtype 'data1' 'float32' --input_dtype 'data2' 'float32'
  --out_node OUT_NODE   Name of the graph's output nodes. Multiple output nodes should be provided separately, like: --out_node out_1 --out_node out_2
  --input_type INPUT_NAME INPUT_TYPE, -t INPUT_NAME INPUT_TYPE
                        Type of data expected by each input op/layer. Type for each input is |default| if not specified. For example: "data" image. Note that the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify multiple --input_type on the command line, e.g.: --input_type "data1" image --input_type "data2" opaque
                        These options get used by the DSP runtime, and the following descriptions state how the input will be handled for each option.
                        Image: input is float between 0-255, the input's mean is 0.0f and the input's max is 255.0f. The float will be cast to uint8ts and the uint8ts passed to the DSP.
                        Default: pass the input as floats to the dsp directly and the DSP will quantize it.
                        Opaque: assumes the input is float because the consumer layer (i.e. the next layer) requires it as float, therefore it won't be quantized.
                        Choices supported: image, default, opaque
  --input_encoding INPUT_NAME INPUT_ENCODING, -e INPUT_NAME INPUT_ENCODING
                        Image encoding of the source images. Default is bgr. Example usage: "data" rgba. Note the quotes should always be included in order to handle special characters, spaces, etc. For multiple inputs specify --input_encoding for each on the command line, e.g.: --input_encoding "data1" rgba --input_encoding "data2" other
                        Use options: color encodings (bgr, rgb, nv21, ...) if the input is an image; time_series for inputs of rnn models; other if the input doesn't follow the above categories or is unknown.
                        Choices supported: bgr, rgb, rgba, argb32, nv21, time_series, other
  --debug [DEBUG]       Run the converter in debug mode.
  --input_layout INPUT_NAME INPUT_LAYOUT, -l INPUT_NAME INPUT_LAYOUT
                        Layout of each input tensor. If not specified, it will use the default based on the source framework, the shape of the input, and the input encoding. Accepted values are: NCDHW, NDHWC, NCHW, NHWC, NFC, NCF, NTF, TNF, NF, NC, F, NONTRIVIAL
                        N = Batch, C = Channels, D = Depth, H = Height, W = Width, F = Feature, T = Time
                        NDHWC/NCDHW used for 5d inputs
                        NHWC/NCHW used for 4d image-like inputs
                        NFC/NCF used for inputs to Conv1D or other 1D ops
                        NTF/TNF used for inputs with time steps like the ones used for LSTM op
                        NF used for 2D inputs, like the inputs to Dense/FullyConnected layers
                        NC used for 2D inputs with 1 for batch and the other for Channels (rarely used)
                        F used for 1D inputs, e.g. Bias tensor
                        NONTRIVIAL for everything else
                        For multiple inputs specify multiple --input_layout on the command line, e.g.: --input_layout "data1" NCHW --input_layout "data2" NCHW
  --udo_config_paths UDO_CONFIG_PATHS [UDO_CONFIG_PATHS ...], -udo UDO_CONFIG_PATHS [UDO_CONFIG_PATHS ...]
                        Path to the UDO configs (space separated, if multiple).
  --dump_relay DUMP_RELAY
                        Dump Relay ASM and Params at the path provided with the argument. Usage: --dump_relay <path_to_dump>
  --disable_batchnorm_folding
                        If not specified, the converter will try to fold batchnorm into the previous layer.
  --keep_disconnected_nodes
                        Disable the optimization that removes ops not connected to the main graph. This optimization uses output names provided over the command line OR inputs/outputs extracted from the source model to determine the main graph.
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path where the converted output model should be saved. If not specified, the converted model will be written to a file with the same name as the input model.
  --copyright_file COPYRIGHT_FILE
                        Path to copyright file. If provided, the content of the file will be added to the output model.
  --model_version MODEL_VERSION
                        User-defined ASCII string to identify the model; only the first 64 bytes will be stored.
  --validation_target RUNTIME_TARGET PROCESSOR_TARGET
                        A combination of processor and runtime target against which the model will be validated. Choices for RUNTIME_TARGET: {cpu, gpu, dsp}. Choices for PROCESSOR_TARGET: {snapdragon_801, snapdragon_820, snapdragon_835}. If not specified, the model will be validated against {snapdragon_820, snapdragon_835} across all runtime targets.
  --strict              If specified, validates in strict mode, whereby the model will not be produced if it violates constraints of the specified validation target. If not specified, validates the model in permissive mode against the specified validation target.

Quantizer Options:
  --quantization_overrides QUANTIZATION_OVERRIDES
                        Use this option to specify a json file with parameters to use for quantization. These will override any quantization data carried from conversion (e.g. TF fake quantization) or calculated during the normal quantization process. Format defined as per AIMET specification.
  --keep_quant_nodes    Use this option to keep activation quantization nodes in the graph rather than stripping them.
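A minimal conversion sketch (the file and input names are placeholders):

snpe-tflite-to-dlc --input_network model.tflite --input_dim 'input' 1,224,224,3 --output_path model.dlc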
Examples of using this script can be found in Converting Models from TFLite to SNPE.
snpe-onnx-to-dlc converts a serialized ONNX model into an SNPE DLC file.
usage: snpe-onnx-to-dlc [-h] [--input_network INPUT_NETWORK] [-o OUTPUT_PATH]
                        [--copyright_file COPYRIGHT_FILE]
                        [--model_version MODEL_VERSION]
                        [--input_type INPUT_NAME INPUT_TYPE]
                        [--input_dtype INPUT_NAME INPUT_DTYPE]
                        [--input_encoding INPUT_NAME INPUT_ENCODING]
                        [--input_layout INPUT_NAME INPUT_LAYOUT]
                        [-n, --no_simplification] [--disable_batchnorm_folding]
                        [--keep_disconnected_nodes]
                        [--validation_target RUNTIME_TARGET PROCESSOR_TARGET]
                        [--strict] [--debug [DEBUG]] [--dry_run [DRY_RUN]]
                        [--udo_config_paths CUSTOM_OP_CONFIG_PATHS [CUSTOM_OP_CONFIG_PATHS ...]]

Script to convert an ONNX model into a DLC file.

optional arguments:
  -h, --help            show this help message and exit

required arguments:
  --input_network INPUT_NETWORK, -i INPUT_NETWORK
                        Path to the source framework model.

optional arguments:
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path where the converted output model should be saved. If not
                        specified, the converted model is written to a file with the
                        same name as the input model.
  --copyright_file COPYRIGHT_FILE
                        Path to a copyright file. If provided, the content of the file
                        is added to the output model.
  --model_version MODEL_VERSION
                        User-defined ASCII string to identify the model; only the
                        first 64 bytes are stored.
  -n, --no_simplification
                        Do not attempt to simplify the model automatically. This may
                        prevent some models from converting properly when sequences of
                        unsupported static operations are present.
  --disable_batchnorm_folding
                        If not specified, the converter tries to fold batchnorm into
                        the previous convolution layer.
  --keep_disconnected_nodes
                        Disable the optimization that removes ops not connected to the
                        main graph. This optimization uses output names provided on the
                        command line, or inputs/outputs extracted from the source
                        model, to determine the main graph.
  --input_type INPUT_NAME INPUT_TYPE, -t INPUT_NAME INPUT_TYPE
                        Type of data expected by each input op/layer. The type for each
                        input is |default| if not specified. For example: "data" image.
                        Note that the quotes should always be included in order to
                        handle special characters, spaces, etc. For multiple inputs,
                        specify multiple --input_type options on the command line, e.g.:
                        --input_type "data1" image --input_type "data2" opaque
                        These options are used by the DSP runtime, and the following
                        descriptions state how the input is handled for each option.
                        image: the input is a float between 0 and 255, the input's mean
                        is 0.0f and the input's max is 255.0f. The floats are cast to
                        uint8s and passed to the DSP.
                        default: the input is passed to the DSP directly as floats, and
                        the DSP quantizes it.
                        opaque: the input is assumed to be float because the consumer
                        layer (i.e. the next layer) requires it as float; therefore it
                        is not quantized.
                        Choices supported: ['image', 'default', 'opaque']
  --input_dtype INPUT_NAME INPUT_DTYPE
                        The name and datatype of each network input layer, specified in
                        the format [input_name datatype], for example: 'data' 'float32'.
                        Default is float32 if not specified. Note that the quotes should
                        always be included in order to handle special characters,
                        spaces, etc. For multiple inputs, specify multiple --input_dtype
                        options on the command line, e.g.:
                        --input_dtype 'data1' 'float32' --input_dtype 'data2' 'float32'
  --input_encoding INPUT_NAME INPUT_ENCODING, -e INPUT_NAME INPUT_ENCODING
                        Image encoding of the source images. Default is bgr.
                        Example usage: "data" rgba. Note that the quotes should always
                        be included in order to handle special characters, spaces, etc.
                        For multiple inputs, specify --input_encoding for each on the
                        command line, e.g.:
                        --input_encoding "data1" rgba --input_encoding "data2" other
                        Use a color encoding (bgr, rgb, nv21, ...) if the input is an
                        image; time_series for inputs to RNN models; other if the input
                        does not follow the above categories or is unknown.
                        Choices supported: ['bgr', 'rgb', 'rgba', 'argb32', 'nv21',
                        'time_series', 'other']
  --input_layout INPUT_NAME INPUT_LAYOUT, -l INPUT_NAME INPUT_LAYOUT
                        Layout of each input tensor. If not specified, the default is
                        chosen based on the source framework, the shape of the input,
                        and the input encoding. Accepted values are:
                        NCDHW, NDHWC, NCHW, NHWC, NFC, NCF, NTF, TNF, NF, NC, F,
                        NONTRIVIAL
                        (N = Batch, C = Channels, D = Depth, H = Height, W = Width,
                        F = Feature, T = Time)
                        NDHWC/NCDHW: used for 5d inputs
                        NHWC/NCHW: used for 4d image-like inputs
                        NFC/NCF: used for inputs to Conv1D or other 1D ops
                        NTF/TNF: used for inputs with time steps, such as those used by
                        the LSTM op
                        NF: used for 2D inputs, such as inputs to Dense/FullyConnected
                        layers
                        NC: used for 2D inputs with 1 for batch and the other for
                        channels (rarely used)
                        F: used for 1D inputs, e.g. a bias tensor
                        NONTRIVIAL: for everything else
                        For multiple inputs, specify multiple --input_layout options on
                        the command line, e.g.:
                        --input_layout "data1" NCHW --input_layout "data2" NCHW
  --validation_target RUNTIME_TARGET PROCESSOR_TARGET
                        A combination of processor and runtime target against which the
                        model will be validated. Choices for RUNTIME_TARGET:
                        {cpu, gpu, dsp}. Choices for PROCESSOR_TARGET:
                        {snapdragon_801, snapdragon_820, snapdragon_835}. If not
                        specified, the model is validated against
                        {snapdragon_820, snapdragon_835} across all runtime targets.
  --strict              If specified, validates in strict mode, whereby the model will
                        not be produced if it violates the constraints of the specified
                        validation target. If not specified, validates the model in
                        permissive mode against the specified validation target.
  --debug [DEBUG]       Run the converter in debug mode.
  --dry_run [DRY_RUN]   Evaluates the model without actually converting any ops, and
                        returns unsupported ops/attributes as well as unused inputs
                        and/or outputs, if any. Leave empty or specify "info" to see
                        the dry run as a table, or specify "debug" to show more
                        detailed messages.
  --udo_config_paths UDO_CONFIG_PATHS [UDO_CONFIG_PATHS ...], -udo UDO_CONFIG_PATHS [UDO_CONFIG_PATHS ...]
                        Path(s) to the UDO configs (space separated, if multiple).

Quantizer Options:
  --quantization_overrides QUANTIZATION_OVERRIDES
                        Use this option to specify a JSON file with parameters to use
                        for quantization. These override any quantization data carried
                        from conversion (e.g. TF fake quantization) or calculated
                        during the normal quantization process. The format is defined
                        per the AIMET specification.
  --keep_quant_nodes    Use this option to keep activation quantization nodes in the
                        graph rather than stripping them.
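For illustration, a minimal conversion command using the options above might look like the following sketch; the model path and input name are hypothetical:

  # Convert a hypothetical ONNX model, overriding the input encoding and layout
  snpe-onnx-to-dlc --input_network models/resnet50.onnx \
                   --output_path resnet50.dlc \
                   --input_encoding "data" rgb \
                   --input_layout "data" NCHW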
For more information, see ONNX Model Conversion
snpe-pytorch-to-dlc converts a serialized PyTorch model into a SNPE DLC file.
usage: snpe-pytorch-to-dlc [-h] --input_network INPUT_NETWORK
                           -d INPUT_NAME INPUT_DIM
                           [--input_type INPUT_NAME INPUT_TYPE]
                           [--input_dtype INPUT_NAME INPUT_DTYPE]
                           [--input_encoding INPUT_NAME INPUT_ENCODING]
                           [--input_layout INPUT_NAME INPUT_LAYOUT]
                           [--debug [DEBUG]] [--dump_relay DUMP_RELAY]
                           [--disable_batchnorm_folding]
                           [--quantization_overrides QUANTIZATION_OVERRIDES]
                           [--keep_quant_nodes] [--keep_disconnected_nodes]
                           [-o OUTPUT_PATH] [--copyright_file COPYRIGHT_FILE]
                           [--model_version MODEL_VERSION]
                           [--validation_target RUNTIME_TARGET PROCESSOR_TARGET]
                           [--strict]
                           [--udo_config_paths CUSTOM_OP_CONFIG_PATHS [CUSTOM_OP_CONFIG_PATHS ...]]

Script to convert a PyTorch model into a DLC file.

required arguments:
  -d INPUT_NAME INPUT_DIM, --input_dim INPUT_NAME INPUT_DIM
                        The name and dimensions of each network input layer, specified
                        in the format [input_name comma-separated-dimensions], for
                        example: 'data' 1,3,224,224. Note that the quotes should always
                        be included in order to handle special characters, spaces, etc.
                        For multiple inputs, specify multiple --input_dim options on
                        the command line, e.g.:
                        --input_dim 'data1' 1,3,224,224 --input_dim 'data2' 1,50,100,3
  --input_network INPUT_NETWORK, -i INPUT_NETWORK
                        Path to the source framework model.

optional arguments:
  --input_type INPUT_NAME INPUT_TYPE, -t INPUT_NAME INPUT_TYPE
                        Type of data expected by each input op/layer. The type for each
                        input is |default| if not specified. For example: "data" image.
                        Note that the quotes should always be included in order to
                        handle special characters, spaces, etc. For multiple inputs,
                        specify multiple --input_type options on the command line, e.g.:
                        --input_type "data1" image --input_type "data2" opaque
                        These options are used by the DSP runtime, and the following
                        descriptions state how the input is handled for each option.
                        image: the input is a float between 0 and 255, the input's mean
                        is 0.0f and the input's max is 255.0f. The floats are cast to
                        uint8s and passed to the DSP.
                        default: the input is passed to the DSP directly as floats, and
                        the DSP quantizes it.
                        opaque: the input is assumed to be float because the consumer
                        layer (i.e. the next layer) requires it as float; therefore it
                        is not quantized.
                        Choices supported: ['image', 'default', 'opaque']
  --input_dtype INPUT_NAME INPUT_DTYPE
                        The name and datatype of each network input layer, specified in
                        the format [input_name datatype], for example: 'data' 'float32'.
                        Default is float32 if not specified. Note that the quotes should
                        always be included in order to handle special characters,
                        spaces, etc. For multiple inputs, specify multiple --input_dtype
                        options on the command line, e.g.:
                        --input_dtype 'data1' 'float32' --input_dtype 'data2' 'float32'
  --input_encoding INPUT_NAME INPUT_ENCODING, -e INPUT_NAME INPUT_ENCODING
                        Image encoding of the source images. Default is bgr.
                        Example usage: "data" rgba. Note that the quotes should always
                        be included in order to handle special characters, spaces, etc.
                        For multiple inputs, specify --input_encoding for each on the
                        command line, e.g.:
                        --input_encoding "data1" rgba --input_encoding "data2" other
                        Use a color encoding (bgr, rgb, nv21, ...) if the input is an
                        image; time_series for inputs to RNN models; other if the input
                        does not follow the above categories or is unknown.
                        Choices supported: ['bgr', 'rgb', 'rgba', 'argb32', 'nv21',
                        'time_series', 'other']
  --debug [DEBUG]       Run the converter in debug mode.
  --input_layout INPUT_NAME INPUT_LAYOUT, -l INPUT_NAME INPUT_LAYOUT
                        Layout of each input tensor. If not specified, the default is
                        chosen based on the source framework, the shape of the input,
                        and the input encoding. Accepted values are:
                        NCDHW, NDHWC, NCHW, NHWC, NFC, NCF, NTF, TNF, NF, NC, F,
                        NONTRIVIAL
                        (N = Batch, C = Channels, D = Depth, H = Height, W = Width,
                        F = Feature, T = Time)
                        NDHWC/NCDHW: used for 5d inputs
                        NHWC/NCHW: used for 4d image-like inputs
                        NFC/NCF: used for inputs to Conv1D or other 1D ops
                        NTF/TNF: used for inputs with time steps, such as those used by
                        the LSTM op
                        NF: used for 2D inputs, such as inputs to Dense/FullyConnected
                        layers
                        NC: used for 2D inputs with 1 for batch and the other for
                        channels (rarely used)
                        F: used for 1D inputs, e.g. a bias tensor
                        NONTRIVIAL: for everything else
                        For multiple inputs, specify multiple --input_layout options on
                        the command line, e.g.:
                        --input_layout "data1" NCHW --input_layout "data2" NCHW
  --dump_relay DUMP_RELAY
                        Dump Relay ASM and params at the path provided with the
                        argument. Usage: --dump_relay <path_to_dump>
  --disable_batchnorm_folding
                        If not specified, the converter tries to fold batchnorm into
                        the previous layer.
  --keep_disconnected_nodes
                        Disable the optimization that removes ops not connected to the
                        main graph. This optimization uses output names provided on the
                        command line, or inputs/outputs extracted from the source
                        model, to determine the main graph.
  -h, --help            show this help message and exit
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path where the converted output model should be saved. If not
                        specified, the converted model is written to a file with the
                        same name as the input model.
  --copyright_file COPYRIGHT_FILE
                        Path to a copyright file. If provided, the content of the file
                        is added to the output model.
  --model_version MODEL_VERSION
                        User-defined ASCII string to identify the model; only the
                        first 64 bytes are stored.
  --validation_target RUNTIME_TARGET PROCESSOR_TARGET
                        A combination of processor and runtime target against which the
                        model will be validated. Choices for RUNTIME_TARGET:
                        {cpu, gpu, dsp}. Choices for PROCESSOR_TARGET:
                        {snapdragon_801, snapdragon_820, snapdragon_835}. If not
                        specified, the model is validated against
                        {snapdragon_820, snapdragon_835} across all runtime targets.
  --strict              If specified, validates in strict mode, whereby the model will
                        not be produced if it violates the constraints of the specified
                        validation target. If not specified, validates the model in
                        permissive mode against the specified validation target.
  --udo_config_paths UDO_CONFIG_PATHS [UDO_CONFIG_PATHS ...], -udo UDO_CONFIG_PATHS [UDO_CONFIG_PATHS ...]
                        Path(s) to the UDO configs (space separated, if multiple).

Quantizer Options:
  --quantization_overrides QUANTIZATION_OVERRIDES
                        Use this option to specify a JSON file with parameters to use
                        for quantization. These override any quantization data carried
                        from conversion (e.g. TF fake quantization) or calculated
                        during the normal quantization process. The format is defined
                        per the AIMET specification.
  --keep_quant_nodes    Use this option to keep activation quantization nodes in the
                        graph rather than stripping them.
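For illustration, a minimal conversion command might look like the following sketch; the TorchScript file and input name are hypothetical. --input_dim is required for this converter, so the input shape must be given explicitly:

  # Convert a hypothetical PyTorch model with an explicit input shape
  snpe-pytorch-to-dlc --input_network models/mobilenet_v2.pt \
                      --input_dim 'input' 1,3,224,224 \
                      --output_path mobilenet_v2.dlc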
For more information, see PyTorch Model Conversion
DESCRIPTION:
------------
snpe-platform-validator checks the SNPE compatibility/capability of a device. This tool
runs on the device, rather than on the host, and requires a few additional files to be
pushed to the device besides its own executable. Additional details below.

REQUIRED ARGUMENTS:
-------------------
  --runtime <RUNTIME>   Specify the runtime to validate. <RUNTIME>: gpu, dsp, aip, all.

OPTIONAL ARGUMENTS:
-------------------
  --coreVersion         Query the runtime core descriptor.
  --libVersion          Query the runtime core library API.
  --testRuntime         Run diagnostic tests on the specified runtime.
  --targetPath <DIR>    The directory to save output on the device. Defaults to
                        /data/local/tmp/platformValidator/output.
  --debug               Turn on verbose logging.
  --help                Show this help message.
Additional details:
Files to push to the device:

  bin/snpe-platform-validator
  lib/libcalculator.so
  lib/libsnpe_dsp_domains_v2.so
  lib/dsp/libcalculator_skel.so
  lib/dsp/libsnpe_dsp_v65_domains_v2_skel.so
  lib/dsp/libsnpe_dsp_v66_domains_v2_skel.so

Example: pushing the arm-android-clang6.0 variant to /data/local/tmp/platformValidator:

  adb push $SNPE_ROOT/bin/arm-android-clang6.0/snpe-platform-validator /data/local/tmp/platformValidator/bin/snpe-platform-validator
  adb push $SNPE_ROOT/lib/arm-android-clang6.0 /data/local/tmp/platformValidator/lib
  adb push $SNPE_ROOT/lib/dsp /data/local/tmp/platformValidator/dsp
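Once the files are pushed, the validator is run on the device from an adb shell. The following is a sketch of one plausible invocation; the LD_LIBRARY_PATH setting is an assumption and may need adjusting for the target:

  adb shell
  cd /data/local/tmp/platformValidator
  # Assumption: the pushed lib directory must be on the loader path
  export LD_LIBRARY_PATH=/data/local/tmp/platformValidator/lib:$LD_LIBRARY_PATH
  ./bin/snpe-platform-validator --runtime dsp --testRuntime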
snpe-throughput-net-run concurrently runs multiple instances of SNPE for a specified duration and measures inference throughput. Each SNPE instance can have its own model, designated runtime, and performance profile. Note that the "--duration" parameter is common to all SNPE instances created.
DESCRIPTION:
------------
Example application demonstrating how to load concurrent SNPE objects using the SNPE
C/C++ API.

REQUIRED ARGUMENTS:
-------------------
  --container <FILE>    Path to the DL container containing the network.
  --duration <VAL>      Duration of time (in seconds) to run network execution.
  --use_cpu             Use the CPU runtime for SNPE.
  --use_gpu             Use the GPU float32 runtime for SNPE.
  --use_gpu_fp16        Use the GPU float16 runtime for SNPE.
  --use_dsp             Use the DSP fixed point runtime for SNPE.
  --use_aip             Use the AIP fixed point runtime for SNPE.
  --perf_profile <VAL>  Specifies the perf profile to set. Valid settings are
                        "balanced", "default", "high_performance",
                        "sustained_high_performance", "burst", "power_saver" and
                        "system_settings". NOTE: "balanced" and "default" are the same;
                        "default" will be deprecated in the future.

OPTIONAL ARGUMENTS:
-------------------
  --debug               Specifies that output from all layers of the network will be
                        saved.
  --userbuffer_auto     Specifies to use userbuffer for input and output, with auto
                        detection of types enabled. Must be used with a user specified
                        buffer.
  --userbuffer_float    Specifies to use userbuffer for inference, and the input type
                        is float. Must be used with a user specified buffer.
  --userbuffer_floatN   Specifies to use userbuffer for inference, and the input type
                        is float16 or float32. Must be used with a user specified
                        buffer.
  --userbuffer_tf8      Specifies to use userbuffer for inference, and the input type
                        is tf8exact0. Must be used with a user specified buffer.
  --userbuffer_tfN      Specifies to use userbuffer for inference, and the input type
                        is tf8exact0 or tf16exact0. Must be used with a user specified
                        buffer.
  --userbuffer_float_output
                        Overrides the userbuffer output used for inference, and the
                        output type is float. Must be used with a user specified
                        buffer.
  --userbuffer_floatN_output
                        Overrides the userbuffer output used for inference, and the
                        output type is float16 or float32. Must be used with a user
                        specified buffer.
  --userbuffer_tf8_output
                        Overrides the userbuffer output used for inference, and the
                        output type is tf8exact0. Must be used with a user specified
                        buffer.
  --userbuffer_tfN_output
                        Overrides the userbuffer output used for inference, and the
                        output type is tf8exact0 or tf16exact0. Must be used with a
                        user specified buffer.
  --storage_dir <DIR>   The directory to store SNPE metadata files.
  --version             Show the SNPE version number.
  --iterations <VAL>    Number of times to iterate through the entire input list.
  --verbose             Print more debug information.
  --skip_execute        Don't do execution (just SNPE graph build/teardown).
  --json <FILE>         Path of the generated JSON report.
  --input_raw <FILE>    Path to raw inputs for the network, separated by ",".
  --enable_cpu_fallback Enables CPU fallback functionality. Defaults to disabled.
  --udo_package_path <VAL,VAL>
                        Path to a UDO package with a registration library for UDOs.
                        Optionally, multiple packages can be provided as a
                        comma-separated list.
  --priority_hint <VAL> Specifies a hint for the priority level. Valid settings are
                        "low", "normal", "normal_high", "high". Defaults to normal.
                        Note: "normal_high" is only available on DSP.
  --help                Show this help message.
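For illustration, the following sketch runs a single DSP instance for 60 seconds under the burst perf profile; the container path is hypothetical:

  # Measure DSP inference throughput for 60 seconds
  snpe-throughput-net-run --container models/mobilenet.dlc \
                          --duration 60 \
                          --use_dsp \
                          --perf_profile burst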
DESCRIPTION:
------------
snpe-platform-validator-py checks the SNPE compatibility/capability of a device. The
output is saved in CSV format in a file in the "Output" directory. Basic logs are also
displayed on the console.

REQUIRED ARGUMENTS:
-------------------
  --runtime <RUNTIME>       Specify the runtime to validate. <RUNTIME>: gpu, dsp, aip,
                            all.
  --directory <ARTIFACTS>   Path to the root of the unpacked SDK directory containing
                            the executable and library files.

OPTIONAL ARGUMENTS:
-------------------
  --buildVariant <VARIANT>  Specify the build variant (e.g. arm-android-clang6.0
                            (default), aarch64-android-clang6.0) to be validated.
  --deviceId                Use this device for running the adb commands. Defaults to
                            the first device in the adb devices list.
  --coreVersion             Output the version of the runtime that is present on the
                            target.
  --libVersion              Output the library version of the runtime that is present
                            on the target.
  --testRuntime             Run a small program on the runtime and check whether SNPE
                            is supported for that runtime.
  --targetPath <PATH>       The path to be used on the device. Defaults to
                            /data/local/tmp/platformValidator. NOTE: this directory
                            will be deleted before proceeding with validation.
  --remoteHost <REMOTEHOST> Run on a remote host through a remote adb server. Defaults
                            to localhost.
  --debug                   Set to turn on debug logging.
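For illustration, a plausible host-side invocation might look like the following sketch; the build variant shown is only an example, and whether the script is launched directly or through a Python interpreter depends on how the SDK is packaged:

  # Validate all runtimes on the first connected device
  snpe-platform-validator-py --runtime all \
                             --directory $SNPE_ROOT \
                             --buildVariant arm-android-clang6.0 \
                             --testRuntime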
DESCRIPTION:
------------
This tool generates a UDO (User Defined Operation) package using a user-provided config
file.

USAGE:
------
snpe-udo-package-generator [-h] --config_path CONFIG_PATH [--debug]
                           [--output_path OUTPUT_PATH] [-f]

REQUIRED ARGUMENTS:
-------------------
  --config_path CONFIG_PATH, -p CONFIG_PATH
                        The path to a config file that defines a UDO.

OPTIONAL ARGUMENTS:
-------------------
  -h, --help            Show this help message and exit.
  --debug               Return debugging information from generating the package.
  --output_path OUTPUT_PATH, -o OUTPUT_PATH
                        Path where the package should be saved.
  -f, --force-generation
                        This option will delete the existing package. Note that
                        appropriate file permissions must be set to use this option.
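For illustration, a plausible invocation might look like the following sketch; the config file name and output directory are hypothetical:

  # Generate a UDO package from a hypothetical config file
  snpe-udo-package-generator --config_path MyUdoOp.json \
                             --output_path ./MyUdoPackage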