Release Notes

What's in Qualcomm Neural Processing SDK v2.10.0?

Features:
  • GPU Runtime: Added support for the Pack operation with a single input.
  • Core: Updated the C API documentation for ITensor/Userbuffer creation to indicate data size.
  • Core: The setLogLevel() API is now hooked up to the runtimes, so the logging level can be updated after the logger handle is created (see the sketch after this list).
  • Tools: snpe-throughput-net-run now supports the --userbuffer_auto option (similar to snpe-net-run) for automatic IO tensor data type detection.
  • Tools: Converters: Added a new optimization sequence to squash BatchNorm into FullyConnected.
  • Tools: Converters: Caffe2: Removed artifacts and documentation references to the deprecated Caffe2 converter.
  • Tools: Converters: Added gather_nd support to the TFLite and PyTorch converters.
  • Tools: Converters: Changed the translation of the FloorDiv operator to ElementWiseDivide when the input data type is Int32.
  • SDK: Enhanced the Mixed Precision section of the documentation.
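For illustration, here is a minimal sketch of the create-then-update pattern that the setLogLevel() item above describes. Everything in this sketch is a self-contained stand-in: the handle type and function signatures are hypothetical, not the SDK's actual logger API, which is defined in the shipped headers.

    #include <stdio.h>

    /* Hypothetical stand-ins for the SDK's logger handle and setLogLevel();
       consult the shipped headers for the real types and signatures. */
    typedef enum { LOG_ERROR, LOG_WARN, LOG_INFO, LOG_VERBOSE } LogLevel;
    typedef struct { LogLevel level; } Logger;

    static Logger createLogger(LogLevel initial) { Logger l = { initial }; return l; }

    /* As of this release, an update like this is honored by the runtimes
       even after the logger handle has been created. */
    static void setLogLevel(Logger *logger, LogLevel level) { logger->level = level; }

    int main(void) {
        Logger logger = createLogger(LOG_WARN);  /* create the handle once */
        setLogLevel(&logger, LOG_VERBOSE);       /* raise verbosity later, e.g. to debug a run */
        printf("log level is now %d\n", logger.level);
        return 0;
    }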
Bugs:
  • HTP: Fixed an issue with ElementwiseSin.
  • Tools: ONNX Converter: Fixed conversion issues for the GRU op related to the number of positional arguments provided.
  • AIP Runtime: Fixed the performance profile setting in multithreaded scenarios.
  • Tools: Quantizer: Fixed an issue where encodings were not consumed properly for the PRelu op due to name mismatches with the original model.
  • Tools: Quantizer: Cleanup and fixes for the LSTM op.
Security Advisory:
  • CVE number: CVE-2023-28543
  • Security assessment rating: High
  • Public disclosure date: 04-September-2023
  • Title: Out of Bounds read in SNPE Library
  • Description: A malformed DLC can trigger memory corruption in the SNPE library due to an out-of-bounds read, for example when loading an untrusted model from a remote source.
In case of any concerns, please send an email to [email protected] or create a new case at https://support.qualcomm.com.

Known Issues:
  • Tools: Quantized models with the LSTM op may fail during inference. This will be fixed in a future release.
  • DSP Runtime: Observing an accuracy drop on some models on V66 compared to SNPE1.
  • DSP Runtime: Performance profiles do not currently work properly on V66. This will be addressed in a future release.
  • DSP Runtime: The DSP runtime on devices with DSP architecture V66 has init and deinit performance regressions.
  • AIP Runtime: Per-layer output dump is not working.
  • AIP Runtime: Some models show performance regressions for init, de-init, and inference.

What's in Qualcomm Neural Processing SDK v2.09.0?

FEATURES
  • Core: Added a new C API, Snpe_SNPE_GetInputDimensionsOfFirstTensor(), to facilitate retrieving input dimensions without an input tensor name (see the sketch after this list).
  • Tools: ONNX converter: Added support for NonMaxSuppression op.
  • Core: Added a new priority hint, NORMAL_HIGH, between NORMAL and HIGH for the DSP runtime.
  • SNPE: Added documentation for Offline Graph Prepare.
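To illustrate the new C API above, a minimal sketch follows. Only the name Snpe_SNPE_GetInputDimensionsOfFirstTensor comes from these notes; the handle and shape types below are assumptions standing in for whatever the shipped headers define.

    #include <stdio.h>
    #include <stddef.h>

    typedef void *Snpe_SNPE_Handle_t;                            /* assumption: opaque SNPE handle */
    typedef struct { size_t rank; size_t dims[8]; } TensorShape; /* assumption */

    /* Function name documented in this release; parameter and return types
       here are assumptions. */
    extern TensorShape Snpe_SNPE_GetInputDimensionsOfFirstTensor(Snpe_SNPE_Handle_t snpe);

    /* Previously the input tensor's name was needed to query its dimensions;
       this call sidesteps that, which is convenient for single-input models. */
    void printFirstInputDims(Snpe_SNPE_Handle_t snpe) {
        TensorShape shape = Snpe_SNPE_GetInputDimensionsOfFirstTensor(snpe);
        for (size_t i = 0; i < shape.rank; ++i)
            printf("%zu%s", shape.dims[i], i + 1 < shape.rank ? "x" : "\n");
    }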
BUG FIXES
  • Tools: snpe-dlc-graph-prepare: Fixed a benign error message during offline prepare for v68-based SoCs (--htp_socs sm8350, sm7350, etc.); see the example below.
  • Tools: snpe-dlc-quantizer: Fixed mixed-precision int-to-float conversion when the output float tensor is overridden.
  • CPU: Fixed an accuracy issue in Grid Sample.
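As a usage note for the snpe-dlc-graph-prepare fix above, a typical offline-prepare invocation that previously printed the benign error looks like: snpe-dlc-graph-prepare --input_dlc model.dlc --output_dlc model_prepared.dlc --htp_socs sm8350. Only --htp_socs and the SoC names come from these notes; --input_dlc and --output_dlc are assumed from the tool's usual interface.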
Known Issues:
  • GPU Runtime: Some networks show accuracy issues on SM8550 due to a bug in OpenCL; a fix from the META build is required.
  • TF Converter: Some models with no-ops will fail to convert.
  • DSP Runtime: Observing an accuracy drop on some models on V66 compared to SNPE1.
  • Converters: Models containing LSTMs with multiple timesteps generate large DLCs. This will be addressed in a future release.
  • DSP Runtime: Performance profiles do not currently work properly on V66, leading to performance issues. This will be addressed in a future release.
  • AIP Runtime: Some models show performance regressions for init, de-init, and inference.
  • SDK: Documentation: Images are missing for the new quantization and architecture checker documentation.
  • ONNX Converter: In some cases, an Add following a MatMul is not properly fused which can lead to accuracy issues. This will be addressed in a future release.
  • Core: There are some memory leaks in snpe-net-run and the SNPE Builder. These will be addressed in a future release.

What's in Qualcomm Neural Processing SDK v2.08.0?

FEATURES
  • Tools: Converters: ONNX: Added support for the Sign op.
BUG FIXES
  • Tools: ONNX Converter: Fixed a TransposeOp input axis format (NT) issue.
  • HTP: Fixed a VTCM overflow issue that occurred when changing the data layout from uint8 flat to uint8 crouton in TCM.
Known Issues:
  • GPU Runtime: Some networks show accuracy issues on SM8550 due to a bug in OpenCL; a fix from the META build is required.
  • TF Converter: Some models with no-ops will fail to convert.
  • DSP Runtime: Observing an accuracy drop on some models on V66 compared to SNPE1.
  • Converters: Models containing LSTMs with multiple timesteps generate large DLCs. This will be addressed in a future release.
  • DSP Runtime: Performance profiles do not currently work properly on V66, leading to performance issues. This will be addressed in a future release.
  • AIP Runtime: Some models show performance regressions for init, de-init, and inference.
  • SDK: Documentation: Images are missing for the new quantization and architecture checker documentation.
  • ONNX Converter: In some cases, an Add following a MatMul is not properly fused which can lead to accuracy issues. This will be addressed in a future release.
  • Core: There are some memory leaks in snpe-net-run and the SNPE Builder. These will be addressed in a future release.

What's in Qualcomm Neural Processing SDK v2.07.0?

BUG FIXES
  • Tools: Converters: Fixed a bug in the optimization that merges MatMul + Reshape + Add into an FC op, which would incorrectly insert the FC op before the constant Bias op.
  • DSP Runtime: Fixed an uninitialized variable in the performance settings on HTP.
Known Issues:
  • DSP Runtime: LSTM is not supported for HTP. It will be addressed in a future release.
  • SNPE AIP: An error log is printed on Ubuntu and LE platforms; it can be ignored.
  • GPU Runtime: Some networks show accuracy issues on SM8550 due to a bug in OpenCL; a fix from the META build is required.
  • TF Converter: Some models with no-ops will fail to convert.
  • DSP Runtime: Some ResNet models fail to prepare with 2 MB of VTCM.
  • DSP Runtime: Observing some performance regressions on HTP FP16 for Elementwise Add and Mul when they do not fit in VTCM.
  • DSP Runtime: Observing an accuracy drop on some models on V66 compared to SNPE1.
  • Converters: Models containing LSTMs with multiple timesteps generate large DLCs. This will be addressed in a future release.
  • DSP Runtime: Performance profiles do not currently work properly on V66, leading to performance issues. This will be addressed in a future release.
  • AIP Runtime: Some models show performance regressions for init, de-init, and inference.
  • AIP Runtime: EfficientDet Lite has an accuracy issue when running on the AIP runtime
  • SDK Documentation: Images are missing for the new quantization and architecture checker documentation
  • ONNX Converter: In some cases, an Add following a MatMul is not properly fused which can lead to accuracy issues. This will be addressed in a future release.
  • Core: In some cases, internal exceptions are not properly caught after an SSR (subsystem restart).
  • Core: There are some memory leaks in snpe-net-run and the SNPE Builder. These will be addressed in a future release.
  • GPU Runtime: A change to the ONNX converter for Softmax updates dimensions in a way that the GPU runtime does not currently support. A fix will be made in a future release.
  • Core: The C API call Snpe_SNPE_GetModelVersion always returns the version of the first model loaded in a process (see the sketch after this list).
  • DSP Runtime: Squeezenet currently has accuracy issues on HTP
  • DSP Runtime: YOLO networks show some accuracy issues for HTP
  • AIP: Performance modes are not working. All networks run with Burst mode.
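To make the Snpe_SNPE_GetModelVersion caveat above concrete, here is a repro sketch. Only the function name appears in these notes; the handle type and the loader helper are hypothetical.

    typedef void *Snpe_SNPE_Handle_t;                          /* assumption: opaque handle */
    extern const char *Snpe_SNPE_GetModelVersion(Snpe_SNPE_Handle_t snpe);
    extern Snpe_SNPE_Handle_t loadModel(const char *dlcPath);  /* hypothetical helper */

    void demo(void) {
        Snpe_SNPE_Handle_t a = loadModel("model_a.dlc");
        Snpe_SNPE_Handle_t b = loadModel("model_b.dlc");
        /* Expected: each handle reports its own model version.
           Observed in this release: both calls report model_a's version,
           because the first model loaded in the process wins. */
        const char *va = Snpe_SNPE_GetModelVersion(a);
        const char *vb = Snpe_SNPE_GetModelVersion(b);
        (void)va; (void)vb;
    }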

What's in Qualcomm Neural Processing SDK v2.05?

IMPORTANT NOTES
  • SNPE now defaults to Unsigned PD, and the delivered skels are not signed.
  • To use signed PD, please sign the skels, and enable the use of signed PD in the platform config options.
NEW FEATURES
  • Tools: Added new options to snpe-net-run and snpe-parallel-run, --use_native_input_files and --use_native_output_files, to support inputs in their native format as opposed to the default float32 format (see the example below).
  • Tools: Added a new flag, --userbuffer_auto, to snpe-parallel-run to automatically detect and use the right buffer type based on the tensor data type in the model.
  • Documentation: Added an SNPE1 to SNPE2 migration guide.
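As a usage note for the native-format options above: an invocation such as snpe-net-run --container model.dlc --input_list inputs.txt --use_native_input_files --use_native_output_files reads and writes tensor data in its native type rather than float32. Only the two native-format flags come from these notes; --container and --input_list are assumed from the tool's usual interface.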
BUG FIXES
  • Tools: snpe-throughput-net-run: The status of a lost thread is now captured in the result summary.
  • Tools: snpe-dlc-quant: Fixed abnormal DLC size increase when axis quantization is used.
  • Tools: TensorFlow Converter: Fixed issues with per-channel quantization of weights: is_symmetric is now set to true by default, and the "axis" and "is_symmetric" params were added to the weight encodings info.
  • HTP: Fixed a VTCM overflow for TransposeConv2d layers where groups > 1, input depth equals output depth, padding is 0, and groups does not equal input depth (see the sketch after this list).
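The TransposeConv2d condition above is easier to read as a predicate. A small sketch, with illustrative parameter names, of the layer shapes that previously overflowed VTCM:

    /* Returns nonzero for the TransposeConv2d configurations that hit the
       old VTCM overflow (parameter names are illustrative). */
    int hitOldVtcmOverflow(int groups, int inDepth, int outDepth, int padding) {
        return groups > 1 && inDepth == outDepth && padding == 0 && groups != inDepth;
    }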
Known Issues:
  • DSP Runtime: Int4 models can see higher accuracy degradation than expected.
  • DSP Runtime: Observing accuracy issues on HTP for some networks using FP16.
  • Tools: Quantizer: The bc (bias correction) algorithm is not currently functional.
  • GPU Runtime: Some networks show accuracy issues due to a bug in OpenCL; a fix from the META build is required.
  • Tools: Platform Validator will hang on some newer metabuilds. This is still being investigated with the platform team.
  • TF Converter: Some models with noops will fail to convert.
  • AIP Runtime: The --debug option in snpe-net-run is not functional.
  • AIP Runtime: init_cache is not working.
  • AIP Runtime: Some models show inference regressions.
  • AIP Runtime: Some models show an accuracy drop.
  • AIP Runtime: Some models with partitions between HTP and DSP fail during model initialization.
  • ONNX Converter: Quantization fails when there are multiple MatMul ops with different weight dimensions connected to a common input: the constant bias op is shared among them and therefore does not reflect the correct shape for each.
  • DSP Runtime: On HTP, when using lower performance profiles, de-init may take more time than in previous releases. This is because the DSP clock was artificially high in the earlier releases.
  • DSP Runtime: LSTM is not supported for HTP. It will be addressed in an upcoming release.
  • DSP Runtime: Squeezenet currently has accuracy issues on HTP.
  • DSP Runtime: YOLO networks show some accuracy issues for HTP.
  • DSP Runtime: Some models show inference regressions for HTP FP16.
  • Tools: Offline Prepare: Some models show issues related to offline prepare.
  • DSP Runtime: Some models are showing accuracy issues on HTP.
  • GPU Runtime: Some networks are showing accuracy issues.
  • Converters: TFLite Converter: Converter incorrectly handles weights for TransposeConv2D ops.
  • SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release.
  • GPU Runtime: The Softmax layer does not support large tensor values in the channel dimension.

What's in Qualcomm Neural Processing SDK v1.68.0?

NEW FEATURES
  • This release uses Android NDK 19c for building the Android code
  • On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare, or the --buffer_data_type argument to both the quantizer and the runtime (see the example below).
  • The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare.
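As a usage note for the float IO mechanism above: for offline prepare, an invocation such as snpe-dlc-quantize --input_dlc model.dlc --input_list inputs.txt --use_float_io --output_dlc model_quantized.dlc would apply the flag. Only --use_float_io and --buffer_data_type come from these notes; the other options are assumed from the quantizer's usual interface.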
New Limitations
  • Core: Relaxed validation criteria for constant tensors for the GPU backend.
  • SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
  • SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants.
  • GPU Runtime: This release shows some performance regressions that will be addressed in the next release.
  • Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release.
  • DSP Runtime: Observing a slight regression in accuracy for mobilenet V2 SSD and inception v3 models on SM8350 and SM7325.
  • ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion.
  • SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release.
  • SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release.

What's in Qualcomm Neural Processing SDK v1.67.0?

NEW FEATURES
  • This release uses Android NDK 19c for building the Android code
  • On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime.
  • The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare.
Known Issues:
  • SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
  • SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants.
  • GPU Runtime: This release shows some performance regressions that will be addressed in the next release.
  • Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release.
  • DSP Runtime: Observing a slight regression in accuracy for mobilenet V2 SSD and inception v3 models on SM8350 and SM7325.
  • ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion.
  • SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release.
  • SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release.

What's in Qualcomm Neural Processing SDK v1.66.0?

IMPORTANT INFORMATION
  • This release uses Android NDK 19c for building the Android code
  • On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime.
  • The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare.
NEW FEATURES
  • Core: Added protection against loading malicious DLC files.
Known Issues:
  • SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
  • SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants.
  • GPU Runtime: This release shows some performance regressions that will be addressed in the next release.
  • Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release.
  • DSP Runtime: Observing a slight regression in accuracy for mobilenet V2 SSD and inception v3 models on SM8350 and SM7325.
  • ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion
  • SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release.
  • SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release.

What's in Qualcomm Neural Processing SDK v1.65.0?

NEW FEATURES
  • This release uses Android NDK 19c for building the Android code
  • On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime
  • The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare
  • Core: Re-enabled LSTM support for CPU and GPU (HTP will follow)
  • DSP Runtime: Implemented rules for coexistence and selection of multiple cache records for HTP based on VTCM size, DSP architecture, and SoC
  • Tools: Converters: Added an optimization to fold scalar Min + Max into ReluMinMax
BUG FIXES
  • Tools: Offline Prepare: Fixed some issues for offline prepare for Depthwise Convolution with Dilation
Known Issues:
  • SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
  • SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants
  • GPU Runtime: This release shows some performance regressions that will be addressed in the next release
  • Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release
  • DSP Runtime: Observing a slight regression in accuracy for mobilenet V2 SSD and inception v3 models on SM8350 and SM7325
  • ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion
  • SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release
  • SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release

What's in Qualcomm Neural Processing SDK v1.64.0?

NEW FEATURES
  • This release uses Android NDK 19c for building the Android code
  • On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime
  • The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare
  • ONNX Converter: Re-enabled the converter's command-line input dtype to take precedence over the model-specified dtype
  • GPU: Improved accuracy for the DeepSORT model; resolved issues with Conv + Elu op fusion
BUG FIXES
  • Quantizer: Fixed an issue observed when applying 8-bit overrides with 16-bit default activation quantization encodings
  • SNPE Core: Fixed a failure to select the HTP offline cache for certain multi-subnet network topologies
  • Tools: Sub is now transformed to AddSub even when input2 is exactly input1
Known Issues:
  • SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
  • SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants
  • GPU Runtime: This release shows some performance regressions that will be addressed in the next release
  • Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release
  • DSP Runtime: Observing a slight regression in accuracy for mobilenet V2 SSD and inception v3 models on SM8350 and SM7325
  • ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion
  • SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release
  • SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release

What's in Qualcomm Neural Processing SDK v1.63.0?

Important Information
  • This release uses Android NDK 19c for building the Android code
  • Previously supported LE platforms that were not supported in 1.62.0 are re-enabled in 1.63.0
  • On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime
  • The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare
  • When using the isRuntimeAvailable API with the SNPE DSP runtime for HTP, the same process domain must be used when calling the SNPEBuilder
NEW FEATURES
  • SNPE Core: Added support for PRelu bias broadcasting in SNPE
  • SNPE Core: The snpe-diagview tool has been updated to display actual units (like cycles) instead of usec by default
  • SNPE Core: OpenGL buffers are supported for the GPU backend
BUG FIXES
  • SNPE Core: Fixed the Zip utility's std::istream index into the internal extensible array to be const for every container (DLC) load
Known Issues:
  • GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
  • GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants
  • GPU Runtime: This release shows some performance regressions that will be addressed in the next release
  • GPU Runtime: UDO is currently not supported for the GPU
  • Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in upcoming releases
  • DSP Runtime: Observing a slight regression in accuracy for mobilenet V2 SSD and inception v3 models on SM8350 and SM7325
  • ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion
  • SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release
  • SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release

What's in Qualcomm Neural Processing SDK v1.62.0?

Important Information
  • This release uses Android NDK 19c for building the Android code
  • This release supports only Android targets; LE targets will return in SNPE 1.63.0
  • On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime.
  • The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare.
NEW FEATURES
  • DSP Runtime: Performance improvements for FP16 models on HTP
  • SNPE Core: Upgraded SNPE's archiving library zlib from version 1.2.11 to version 1.2.12
  • SNPE Core: Validation results are now persisted in the offline cache, reducing init time for an offline-prepared DLC
  • SNPE Core: Relaxed dimension constraints for the PRelu layer in SNPE to support broadcasting
  • DSP Runtime: Optimized performance of the Elementwise Div layer for V65 and V66
  • DSP Runtime: Added GatherV2 support
  • Tools: Converters: Added an optimization that merges low-level ops into a Prelu op
  • Tools: Converters: Added an optimization to squash ReduceL2 and Div ops into an L2Norm op
BUG FIXES
  • Tools: Converters: TF: Fixed an issue with translating explicit padding from the Conv op
  • Tools: Converters: Onnx: Fixed the Onnx Concat axis handling
  • Tools: Converters: Onnx: Fixed implementation details for the Conv1D and Pool1D ops
  • Tools: Converters: Onnx: Added an optimization that folds consecutive Reshapes
Known Issues:
  • This release supports Android targets only. LE platforms will return in SNPE 1.63.0
  • SNPE GPU Runtime: OpenGL buffer is not supported
  • SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
  • SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants
  • GPU Runtime: This release shows some performance regressions that will be addressed in the next release
  • GPU Runtime: UDO is currently not supported for the GPU
  • Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release
  • DSP Runtime: Observing a slight regression in accuracy for mobilenet V2 SSD and inception v3 models on SM8350 and SM7325
  • ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion
  • SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release
  • SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release