Release Notes

What's in Qualcomm Neural Processing SDK v1.36.0?

  • Added Java API extension to register UDO package with SNPE
  • snpe-dlc-info now prints the command line that was used to quantize the DLC, if applicable
  • Added support to handle UDO layers with multiple TF8 outputs with different quantization parameters
  • Added support for an additional profiling level (moderate) for the SNPE benchmarking script and the associated snpe-net-run executable, for tracking initialization time metrics
  • Upgraded DSP to use Hexagon SDK 3.5.1 toolchain
  • Extended the Platform Validator to detect the HTA API version
  • Added a VOLATILE_CHECK mode for SNPE DSP runtime checking, which queries runtime availability on each call instead of returning a cached result (see the sketch after this list)
  • Added the LOW_POWER_SAVER, HIGH_POWER_SAVER, and LOW_BALANCED performance modes for the CPU runtime
  • Fixed a bug with propagation of the model version during conversion
  • Fixed an issue with selecting the correct output shape during graph transformation when inserting a 1x1 conv2d for different input formats
  • Fixed an issue with allocation of the layer descriptor while loading a network on HTA
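
A minimal sketch of the new check mode from the C++ API. VOLATILE_CHECK is the mode named above; the overload and the exact enumerator spelling should be verified against SNPEFactory.hpp and DlEnums.hpp in this SDK drop.

    #include "SNPE/SNPEFactory.hpp"
    #include "DlSystem/DlEnums.hpp"

    // Sketch: ask whether the DSP runtime is usable right now, re-evaluating
    // availability on every call rather than returning the cached answer.
    // The enumerator name follows the release note above.
    bool isDspAvailableNow()
    {
        return zdl::SNPE::SNPEFactory::isRuntimeAvailable(
            zdl::DlSystem::Runtime_t::DSP,
            zdl::DlSystem::RuntimeCheckOption_t::VOLATILE_CHECK);
    }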

What's in Qualcomm Neural Processing SDK v1.35.0?

  • Introduced the User-Defined Operations (UDO) feature
  • Added support for SDM720G/SM7125
  • Added support to snpe-throughput-net-run for UserBuffer input tensors (both INT8 and INT16)
  • Added input batching support for networks that can run completely on the AIP runtime
  • Added support for the tf.stack and tf.unstack ops to the DSP and CPU runtimes
  • Added support for the tf.stack, tf.unstack, tf.floor, and tf.minimum ops to the TF converter
  • Fixed some small memory leaks that were seen when repeatedly calling dlopen()/dlclose() on libSNPE.so
  • Updated the Deconvolution operation on DSP with a new kernel that improves performance on various kernel sizes and strides
  • Fixed an ssd_detection cDSP crash in the DSP runtime
  • Updated the HTA to partition the input layer if it has a connection to a layer that is not included in the same partition
  • Improved the tiling configuration support for depthwise convolution layers

What's in Qualcomm Neural Processing SDK v1.34.0?

  • Initial support for ops with 16-bit activations using HTA, in both snpe-dlc-quantize and the SNPE AIP runtime.
  • New option for snpe-net-run to automatically turn unconsumed tensors of the network (tensors that are not inputs to a layer) into network outputs.
  • Fixed inconsistent results on SM8250 in certain cases for depthwise convolutions.
  • Added support for the depth2space operation on the GPU.
  • Now uses an optimized Softmax implementation in AIP networks when the input activation has more than 5000 elements.
  • Truncated the detection output on DSP to return only valid data.
  • Ensured weights are properly flushed to DDR for use during inference in the DSP runtime.
  • Fixed support for NV21 encoding in the DSP runtime.

What's in Qualcomm Neural Processing SDK v1.33.2?

  • Addressed accuracy issues for Deconvolution in the AIP runtime
  • Changed the behavior of Crop layer resize so that it retains the number of copied elements in each dimension
  • Made the quantizer --override_params option work for AIP
  • Reordered PerformanceProfile_t to be ABI compatible with 1.32.0
  • Now uses an optimized Softmax implementation in AIP networks when the input activation has more than 5000 elements

What's in Qualcomm Neural Processing SDK v1.33.1?

  • New performance modes have been added (see the sketch after this list):
      • LOW_POWER_SAVER: runs at a lower clock than POWER_SAVER, at the expense of performance
      • HIGH_POWER_SAVER: runs at a higher clock and provides better performance than POWER_SAVER
      • LOW_BALANCED: runs in a lower balanced mode, providing lower performance than BALANCED
  • snpe-dlc-info adds a summary of the layer types in use in the model
  • Updated to use new BLAS functionality that leverages OpenMP. This adds a new dependency on the OpenMP shared library for Linux platforms
  • Added 32-bit bias support
  • Added init caching support for the SSD output layer on DSP
  • Bugs:
      • Fixed a memory leak that caused increasing init time for DSP
      • Added converter support for dilated convolution when used with fakequant nodes
      • Fixed multiple bugs in snpe-onnx-to-dlc that caused errors for models containing the torch.Mul op
      • Extended TF converter support to the NMSv1 op, in addition to the existing support for the v2 and v3 NMS ops
      • Fixed a TensorFlow conversion bug in infer_shape for the StridedSlice op: output_shape should be the shape of the single output, not a list of shapes
      • Fixed a bug with propagation of the model version during conversion
      • If burst mode is set, thread affinity is now set to the big cores during init and de-init, and restored to the previous setting after those actions complete
      • Fixed a segfault when using user buffers with a resizable dimension
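
The new profiles are selected the same way as the existing ones, at network build time. A minimal C++ sketch, assuming the standard SNPEBuilder flow; the enumerator spellings follow the list above, and zdl::DlSystem::PerformanceProfile_t in DlEnums.hpp is the authoritative set.

    #include <memory>
    #include <string>

    #include "DlContainer/IDlContainer.hpp"
    #include "DlSystem/DlEnums.hpp"
    #include "SNPE/SNPE.hpp"
    #include "SNPE/SNPEBuilder.hpp"

    // Sketch: build a network that trades speed for power by selecting one of
    // the new profiles. Everything except the profile value is the usual
    // container-open / builder / build sequence.
    std::unique_ptr<zdl::SNPE::SNPE> buildLowPower(const std::string& dlcPath)
    {
        auto container = zdl::DlContainer::IDlContainer::open(dlcPath);
        if (!container) return nullptr;

        zdl::SNPE::SNPEBuilder builder(container.get());
        return builder
            .setPerformanceProfile(zdl::DlSystem::PerformanceProfile_t::LOW_POWER_SAVER)
            .build();
    }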

What's in Qualcomm Neural Processing SDK v1.32?

  • Added Caffe MVN layer support in the Caffe converter, CPU runtime, and DSP runtime
  • snpe-dlc-quantize: Enabled the use of quantization parameters calculated during training. To override the SNPE-generated quantization parameters, pass --override_params to snpe-dlc-quantize.
  • Removed deprecated command-line arguments from the converters. All three converters now require passing -i/--input_network for model input paths. Help menus are updated for each converter
  • snpe-dlc-diff: Added the command-line option [--diff_by_id/-i]. This option allows users to compare two models in order (sorted by id), as opposed to only diffing common layers
  • Added support for L2Norm layer to TensorFlow converter
  • Optimized the DSP performance for the 'Space To Depth' layer
  • Added support in the Java API for setInitCacheEnabled() and setStorageDirectory() to enable DLC caching (a rough C++ counterpart is sketched after this list)
  • Allowed graceful recovery after a FastRPC error: the userPD is recreated after the cDSP crashes, so the user can continue in the same SNPE process with subsequent instances instead of having to close it. Note: all instances associated with the previous userPD are lost.
  • snpe-dlc-viewer: Associated each layer type with a fixed color for consistency when using snpe-dlc-viewer
  • Split the SNPE isRuntimeAvailable method into two separate functions to improve backward compatibility with existing client binaries that were built against the older signature.
  • Bugs:
      • TF Converter: Fixed Elementwise Broadcast support
      • ONNX Converter: Fixed a bug where the output dimension was incorrect when the keep_dims parameter was set to False for Argmax, ReduceSum and ReduceMax.
      • ONNX Converter: Fixed a bug where the pad attribute was not properly parsed for the Deconv op.
      • Caffe Converter: Fixed a bug when converting SSD-based models using Python 3.
      • TF Converter: Fixed a bug where the converter was removing a const op input to a reshape op when passed through identity op(s), i.e. const -> identity -> reshape.
      • Fixed a bug where getOutputSize() would give the wrong result on output tensors in UserBuffer mode
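
The release note above names the Java entry points for DLC caching. A rough C++ counterpart is sketched below; the method names used here (setSNPEStorageLocation on the factory, setInitCacheMode on the builder) are assumptions based on the Java names and should be verified against SNPEFactory.hpp and SNPEBuilder.hpp.

    #include <memory>
    #include <string>

    #include "DlContainer/IDlContainer.hpp"
    #include "SNPE/SNPE.hpp"
    #include "SNPE/SNPEBuilder.hpp"
    #include "SNPE/SNPEFactory.hpp"

    // Rough sketch of DLC init caching from C++. The two cache-related calls
    // below are assumed names mirroring the Java API; check the SDK headers.
    std::unique_ptr<zdl::SNPE::SNPE> buildWithCache(const std::string& dlcPath,
                                                    const std::string& cacheDir)
    {
        // Directory where the cached init data is kept between runs.
        zdl::SNPE::SNPEFactory::setSNPEStorageLocation(cacheDir.c_str());

        auto container = zdl::DlContainer::IDlContainer::open(dlcPath);
        if (!container) return nullptr;

        zdl::SNPE::SNPEBuilder builder(container.get());
        return builder
            .setInitCacheMode(true)   // reuse cached init data on later loads
            .build();
    }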

What's in Qualcomm Neural Processing SDK v1.31?

  • New patterns were added to enable running the CLE algorithm on more op patterns and model architectures
  • Added support for HeatmapMaxKeypoint and the ROI Align layer in the CPU runtime
  • Added initial L2Norm layer support in the CPU runtime. No support for the axis parameter yet: normalization is performed along the inner-most dimension of the input tensor (see the sketch after this list)
  • Support for single-input Concatenation layers was added to CPU, GPU and DSP
  • Added support for Detection Output layer on DSP runtime. Currently, only a batch of 1 is supported
  • Changed how the number of batch dimensions is determined in the Fully Connected layer: a rank greater than 1 is now always assumed to mean there is one batch dimension
  • Enhanced the dlc-info tool to report the runtimes available per layer
  • Removed a constraint on the LSTM layer in the GPU runtime that prevented batch mode operation
  • Added Tensorflow converter support for Caffe-style SSD networks
  • Added support for Leaky-RELU in the TensorFlow converter. Both the actual Leaky-Relu op and the elementwise op representation are supported and map to SNPE's Prelu op.
  • Added Argmax support to the Caffe converter, and optimized performance on the DSP runtime
  • Added a new column to snpe-dlc-info that displays the supported runtimes for each layer
  • Initial support for per-layer statistics from AIP/HTA subnets
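
To make the inner-most-dimension behavior of the new L2Norm layer concrete, here is a small standalone sketch (not SNPE code) of the computation: for an input viewed as rows of length C, each row is divided by its L2 norm. The epsilon guard is illustrative.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Standalone illustration: L2-normalize a flat buffer along its innermost
    // dimension. For a tensor of shape [..., C], every consecutive run of C
    // elements is scaled by the inverse of its L2 norm.
    void l2normInnermost(std::vector<float>& data, std::size_t innerDim,
                         float epsilon = 1e-12f)
    {
        for (std::size_t start = 0; start + innerDim <= data.size(); start += innerDim) {
            float sumSq = 0.0f;
            for (std::size_t i = 0; i < innerDim; ++i)
                sumSq += data[start + i] * data[start + i];
            const float invNorm = 1.0f / std::sqrt(sumSq + epsilon);
            for (std::size_t i = 0; i < innerDim; ++i)
                data[start + i] *= invNorm;
        }
    }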

What's in Qualcomm Neural Processing SDK v1.30?

  • Documentation has been added to reflect the new common converter command line options for input processing
  • Converters now propagate required batchnorm information for performing quantization optimizations
  • Support for the new bias correction quantization optimization, which adjusts biases by analyzing float vs. quantized activation errors and adjusting the model to compensate (see the sketch after this list)
  • The ONNX converter now filters single-input Concat ops as no-ops, since SNPE does not support them
  • Converter input processing now uniformly handles different input types and encodings
  • ONNX converter now supports the ConvTranspose ‘output_padding’ attribute by adding an additional pad layer after the ConvTranspose op
  • Integrates the latest FlatBuffers 1.11 library, which brings speed improvements and options for model size reduction
  • GPU size limitations with the ArgMax op (when setting the keepDims op attribute to false) can be worked around by enabling CPU fallback
  • Fixed DSP error with MobileNet SSD on QCS403 and QCS405
  • Fixed an issue with partitioning of the deconv layer on HTA
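
The bias correction optimization is, at its core, a per-channel adjustment: the mean error between float and quantized activations is estimated and folded back into the layer bias. A simplified standalone sketch of that idea follows (not the SDK's implementation; the data layout and averaging here are illustrative).

    #include <cstddef>
    #include <vector>

    // Simplified illustration of bias correction: given per-channel activations
    // collected from the float model and from the quantized model, estimate the
    // mean per-channel error and subtract it from the bias so the quantized
    // layer's expected output lines up with the float layer's.
    void correctBias(const std::vector<std::vector<float>>& floatActs,  // [channel][sample]
                     const std::vector<std::vector<float>>& quantActs,  // [channel][sample]
                     std::vector<float>& bias)                          // [channel]
    {
        for (std::size_t c = 0; c < bias.size(); ++c) {
            if (floatActs[c].empty()) continue;
            double meanErr = 0.0;
            for (std::size_t s = 0; s < floatActs[c].size(); ++s)
                meanErr += quantActs[c][s] - floatActs[c][s];   // quantization-induced shift
            meanErr /= static_cast<double>(floatActs[c].size());
            bias[c] -= static_cast<float>(meanErr);             // fold the correction into the bias
        }
    }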