Release Notes
What's in Qualcomm Neural Processing SDK v2.07.0?
BUG FIXES
- Tools: Converters: Fixed a bug in the optimization that merges MatMul + Reshape + Add into an FC Op, which would incorrectly insert the FC Op before the constant bias Op
- DSP Runtime: Fixed an uninitialized variable in the performance settings on HTP
KNOWN ISSUES
- DSP Runtime: LSTM is not supported for HTP. It will be addressed in a future release.
- SNPE AIP: An error log is printed on Ubuntu and LE platforms; it can be ignored
- GPU Runtime: Some networks show accuracy issues on SM8550 due to a bug in OpenCL that requires a fix from the META build
- TF Converter: Some models with no-ops will fail to convert
- DSP Runtime: Some ResNet models fail to prepare on targets with 2 MB of VTCM
- DSP Runtime: Observing some performance regressions on HTP FP16 for Elementwise Add and Mul when their tensors do not fit in VTCM
- DSP Runtime: Observing accuracy drop on some models on V66 compared to SNPE1
- Converters: Models containing LSTMs with multiple timesteps generate large DLCs. This will be addressed in a future release.
- DSP Runtime: Performance profiles do not currently work properly on V66, leading to performance issues. This will be addressed in a future release.
- AIP Runtime: Some models show performance regressions for init, de-init, and inference
- AIP Runtime: EfficientDet Lite has an accuracy issue when running on the AIP runtime
- SDK Documentation: Images are missing for the new quantization and architecture checker documentation
- ONNX Converter: In some cases, an Add following a MatMul is not properly fused which can lead to accuracy issues. This will be addressed in a future release.
- Core: In some cases internal exceptions are not properly caught after an SSR (subsystem restart)
- Core: There are some memory leaks in snpe-net-run and the SNPE Builder. These will be addressed in a future release.
- GPU Runtime: A change to the ONNX converter for Softmax updates dimensions in a way that the GPU runtime does not currently support. A fix will be made in a future release.
- Core: The C API call for Snpe_SNPE_GetModelVersion always returns the version for the first model that was loaded in a process
- DSP Runtime: Squeezenet currently has accuracy issues on HTP
- DSP Runtime: YOLO networks show some accuracy issues for HTP
- AIP Runtime: Performance modes are not working. All networks run in Burst mode.
What's in Qualcomm Neural Processing SDK v2.05?
IMPORTANT NOTES
- SNPE now defaults to Unsigned PD, and the delivered skels are not signed.
- To use signed PD, please sign the skels, and enable the use of signed PD in the platform config options.
- Tools: Added new options --use_native_input_files and --use_native_output_files to snpe-net-run and snpe-parallel-run to support inputs in their native format as opposed to the default float32 format (see the example commands below).
- Tools: Added a new flag --userbuffer_auto in snpe-parallel-run to automatically detect and use the right buffer type based on the tensor data type in the model.
- Documentation: Added an SNPE1-to-SNPE2 migration guide.
- Tools: snpe-throughput-net-run: The status of lost threads is now captured in the result summary.
- Tools: snpe-dlc-quant: Fixed an abnormal DLC size increase when axis quantization is used.
- Tools: TensorFlow Converter: Fixed issues with per-channel quantization of weights: is_symmetric is now set to true by default, and the "axis" and "is_symmetric" parameters were added to the weight encodings info.
- HTP: Fixed a VTCM overflow for TransposeConv2D layers with groups > 1, input depth equal to output depth, padding = 0, and groups != input depth.
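As a minimal sketch of the new snpe-net-run and snpe-parallel-run flags above (model and input-list file names are placeholders; the remaining flags follow standard usage):

  snpe-net-run --container model.dlc --input_list inputs.txt --use_native_input_files --use_native_output_files
  snpe-parallel-run --container model.dlc --input_list inputs.txt --userbuffer_auto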
KNOWN ISSUES
- DSP Runtime: Int4 models can see higher accuracy degradation than expected.
- DSP Runtime: Observing accuracy issues on HTP for some networks using FP16.
- Tools: Quantizer: The bc algorithm is not currently functional.
- GPU Runtime: Some networks show accuracy issues due to a bug in OpenCL that requires a fix from the META build.
- Tools: Platform Validator will hang on some newer metabuilds. This is still being investigated with the platform team.
- TF Converter: Some models with no-ops will fail to convert.
- AIP Runtime: The --debug option in snpe-net-run is not functional.
- AIP Runtime: init_cache is not working.
- AIP Runtime: Some models show inference regressions.
- AIP Runtime: Some models show accuracy drops.
- AIP Runtime: Some models with partitions between HTP and DSP fail during model initialization.
- ONNX Converter: Quantization fails when multiple MatMul Ops with different weight dimensions are connected to a common input; the constant bias Op is shared among them and therefore does not reflect the correct shape for each.
- DSP Runtime: On HTP, when using lower performance profiles, de-init may take more time than in previous releases. This is because the DSP clock was artificially high in the earlier releases.
- DSP Runtime: LSTM is not supported for HTP. It will be addressed in an upcoming release.
- DSP Runtime: Squeezenet currently has accuracy issues on HTP.
- DSP Runtime: YOLO networks show some accuracy issues for HTP.
- DSP Runtime: Some models show inference regressions for HTP FP16.
- Tools: Offline Prepare: Some models show issues related to offline prepare.
- DSP Runtime: Some models are showing accuracy issues on HTP.
- GPU Runtime: Some networks are showing accuracy issues.
- Converters: TFLite Converter: Converter incorrectly handles weights for TransposeConv2D ops.
- SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release.
- GPU Runtime: The Softmax layer does not support large tensor sizes in the channel dimension.
What's in Qualcomm Neural Processing SDK v1.68.0?
NEW FEATURES
- This release uses Android NDK 19c for building the Android code
- On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime (see the example command below).
- The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare.
- Core: Relaxed validation criteria for constant tensor for GPU backend.
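As a minimal sketch of the offline-prepare flow described above (file names are placeholders, and flags other than --use_float_io are assumed from standard snpe-dlc-quantize usage):

  snpe-dlc-quantize --input_dlc model.dlc --input_list inputs.txt --output_dlc model_quantized.dlc --enable_htp --use_float_io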
KNOWN ISSUES
- SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
- SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants.
- GPU Runtime: This release shows some performance regressions that will be addressed in the next release.
- Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release.
- DSP Runtime: Observing slight regression in accuracy for MobileNet V2 SSD and Inception v3 models on SM8350 and SM7325.
- ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion.
- SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release.
- SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release.
What's in Qualcomm Neural Processing SDK v1.67.0?
NEW FEATURES
- This release uses Android NDK 19c for building the Android code
- On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime.
- The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare.
KNOWN ISSUES
- SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
- SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants.
- GPU Runtime: This release shows some performance regressions that will be addressed in the next release.
- Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release.
- DSP Runtime: Observing slight regression in accuracy for MobileNet V2 SSD and Inception v3 models on SM8350 and SM7325.
- ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion.
- SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release.
- SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release.
What's in Qualcomm Neural Processing SDK v1.66.0?
IMPORTANT INFORMATION
- This release uses Android NDK 19c for building the Android code
- On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime.
- The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare.
- Core: Added protection against loading malicious DLC files
KNOWN ISSUES
- SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
- SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants.
- GPU Runtime: This release shows some performance regressions that will be addressed in the next release.
- Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release.
- DSP Runtime: Observing slight regression in accuracy for MobileNet V2 SSD and Inception v3 models on SM8350 and SM7325.
- ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion
- SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release.
- SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release.
What's in Qualcomm Neural Processing SDK v1.65.0?
NEW FEATURES
- This release uses Android NDK 19c for building the Android code
- On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime
- The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare
- Core: Re-enabled LSTM support for CPU and GPU (HTP will follow)
- DSP Runtime: Implemented rules for coexistence and selection of multiple cache records for HTP based on VTCM size, DSP Architecture, and SoC
- Tools: Converter: Added optimization to fold scalar min + max to ReluMinMax
- Tools: Offline Prepare: Fixed some issues for offline prepare for Depthwise Convolution with Dilation
KNOWN ISSUES
- SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
- SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants
- GPU Runtime: This release shows some performance regressions that will be addressed in the next release
- Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release
- DSP Runtime: Observing slight regression in accuracy for MobileNet V2 SSD and Inception v3 models on SM8350 and SM7325
- ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion
- SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release
- SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release
What's in Qualcomm Neural Processing SDK v1.64.0?
NEW FEATURES
- This release uses Android NDK 19c for building the Android code
- On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime
- The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare
- ONNX Converter: Re-enabled the converter command-line input dtype to take precedence over the dtype specified in the model (see the example command below)
- GPU: Improved accuracy for the DeepSORT model; resolved issues with Conv + Elu op fusion
- Quantizer: Fixed issue observed with applying 8-bit overrides using 16-bit default activation quantization encodings
- SNPE Core: Fixed failure to select HTP offline cache for certain multi-subnet network topologies
- Tools: Transform Sub to AddSub even when input2 is identical to input1
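A hypothetical invocation in which the command-line dtype takes precedence over the dtype declared in the model (file and input names are placeholders, and --input_dtype is assumed from standard converter usage):

  snpe-onnx-to-dlc --input_network model.onnx --input_dtype input_0 float32 --output_path model.dlc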
KNOWN ISSUES
- SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
- SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants
- GPU Runtime: This release shows some performance regressions that will be addressed in the next release
- Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release
- DSP Runtime: Observing slight regression in accuracy for MobileNet V2 SSD and Inception v3 models on SM8350 and SM7325
- ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion
- SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release
- SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release
What's in Qualcomm Neural Processing SDK v1.63.0?
IMPORTANT INFORMATION
- This release uses Android NDK 19c for building the Android code
- Previously supported LE platforms that were not supported in 1.62.0 are re-enabled in 1.63.0
- On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime
- The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare
- When using the isRuntimeAvailable API with the SNPE DSP runtime for HTP, the same process domain must be used when calling the SNPEBuilder
- SNPE Core: Added support for PReLU bias broadcasting in SNPE
- SNPE Core: The snpe-diagview tool has been updated to display actual units (like cycles) instead of usec by default (see the example command below)
- SNPE Core: OpenGL buffers are supported for the GPU backend
- SNPE Core: Fixed the Zip utility's std::istream index into the internal extensible array to be const for every container (DLC) load
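For example, to inspect a diag log produced by snpe-net-run (the log file path is a placeholder):

  snpe-diagview --input_log_file output/SNPEDiag_0.log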
KNOWN ISSUES
- GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
- GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants
- GPU Runtime: This release shows some performance regressions that will be addressed in the next release
- GPU Runtime: UDO is currently not supported for the GPU
- Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in upcoming releases
- DSP Runtime: Observing slight regression in accuracy for MobileNet V2 SSD and Inception v3 models on SM8350 and SM7325
- ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion
- SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release
- SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release
What's in Qualcomm Neural Processing SDK v1.62.0?
IMPORTANT INFORMATION
- This release uses Android NDK 19c for building the Android code
- This release supports only Android targets; LE targets will return in SNPE 1.63.0
- On HTP targets, the mechanism for handling floating point inputs and outputs has changed. For best performance, please specify the --use_float_io argument to the quantizer for offline prepare or the --buffer_data_type argument to both the quantizer and the runtime.
- The HTP stub and skel artifacts have been renamed to libSnpeHtpV68Stub/Skel and libSnpeHtpV69Stub/Skel. Also, there is a separate libSnpeHtpPrepare.so for performing online prepare.
- DSP Runtime: Perf improvement for FP16 models on HTP
- SNPE Core: Upgraded SNPE's archiving library zlib from version 1.2.11 to version 1.2.12
- SNPE Core: Validation results are now persisted in the offline cache, reducing init time for an offline-prepared DLC
- SNPE Core: Relaxed dimension constraints for the PReLU layer in SNPE to support broadcasting
- DSP Runtime: Optimized performance of the Elementwise Div layer for V65 and V66
- DSP Runtime: Added GatherV2 support
- Tools: Converters: Added an optimization that merges low-level Ops into Prelu Op
- Tools: Converters: Added an optimization to squash ReduceL2 and Div Op into L2Norm Op
- Tools: Converters: TF: Fixed issue with translating explicit padding from Conv Op
- Tools: Converters: Onnx: Fixed ONNX Concat axis handling
- Tools: Converters: Onnx: Fixed implementation details for Conv1D and Pool1D Ops
- Tools: Converters: Onnx: Added optimization folding continuous reshapes
KNOWN ISSUES
- This release supports Android targets only; LE platforms will return in SNPE 1.63.0
- SNPE GPU Runtime: OpenGL buffer is not supported
- SNPE GPU Runtime: VGG-16 and VGG-19 networks are not supported on SM6115, SM4250, SM6225, QRB5165, QCS610LE, QCS605
- SNPE GPU Runtime: Some networks are showing minor mAP variations: Inception, Mobilenet, Resnet and VGG variants
- GPU Runtime: This release shows some performance regressions that will be addressed in the next release
- GPU Runtime: UDO is currently not supported for the GPU
- Tools: Android Sample App: UDO support in the Android Sample App is temporarily broken, and will be fixed in the next release
- DSP Runtime: Observing slight regression in accuracy for MobileNet V2 SSD and Inception v3 models on SM8350 and SM7325
- ONNX Models like DETR with rank 3 inputs to Matmul followed by BiasAdd fail during conversion
- SNPE Core: LRN - Alpha scaling, Generate Proposals (Caffe2) and CropAndResize layers are not supported in this release
- SNPE Core: Support for Caffe2 BboxTransform and Caffe2 BoxWithNMSLimit is retired from this release
What's in Qualcomm Neural Processing SDK v1.61.0?
NEW FEATURES
- Converters: Onnx: Enabled support to handle custom op inputs correctly when the default values are provided
- ONNX Converter: Added support to resolve static ONNX Cast operation as Constant
- CPU Runtime: Added support for CRD mode for DepthToSpace (PixelShuffle)
- ONNX Converter: Fixed simplifier behavior with given input dimensions
- DSP Runtime: Added support for LayerNorm for V65/V66
- Converters: Added new pattern to fold ReduceL2 + Div as L2Norm
- Converters: Added support for Relay IR's requantize op that can be seen in framework quantized models
- Core: Improved performance of loading DLC from a memory buffer
- ONNX Converter: Fixed scale calculation for the ONNX Resize operator in align_corner mode. Also overrides the Resize input axis format as per the source axis order
- Caffe Converter: Added support for Caffe Scale where the scale weights are of shape [batch,channels] and axis == 0
- ONNX Converter: Fixed issues for Axis Tracking related to L2 Norm
- SDK: Updated sample code to demonstrate handling multiple ITensor inputs
- AIP Runtime: Fixed low accuracy issue on mobilenet variant for Multi-class NMS layer
- ONNX Converters: Added support for combination of Nearest and Half_pixel modes for ResizeOp
KNOWN ISSUES
- SNPE DSP: An error is observed if the second input to the Scale layer has rank equal to 1
- Higher de-init time is observed on the QRB5165 platform with the CPU runtime for models like MobileNet
What's in Qualcomm Neural Processing SDK v1.60.0?
NEW FEATURES
- Tools: Converter: Added ONNX Gemm transA and transB support
- Native sample code is updated to take static quantization parameters for quantized input buffers
- libSNPE.so, libcalculator.so, libplatformValidatorShared.so, and libnpe_dsp_domains_v2.so (the libraries generated with the gcc7.5, gcc8.2, and gcc9.3 toolchains) are now compiled with additional read-only relocation compiler flags (see the example command below)
- Documentation update: User Logging API documentation added in Application Tips section
- HTP: Fixed issue with Cast op usage in certain configurations
- ONNX Converter: Improvements to handle different input axis layouts
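One way to verify the read-only relocation hardening on a delivered library (the library path is a placeholder; readelf is a generic ELF inspection tool, not part of the SDK) is to check its program headers:

  readelf -l libSNPE.so | grep GNU_RELRO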
KNOWN ISSUES
- Minor reduction in accuracy for VGG16 is observed
- Error: Model validation fails for the FC layer with an error that there is a mismatch between weights and input dimensions
- Characteristic: Typically seen with ONNX models where the FC layer input (with 4D input A and 2D input B) follows a Reshape layer either immediately or after some trivial eltwise layers
- Workaround: Insert a Reshape op before FC on input A with shape (orig_4D_shape[0], -1)
- Error: ONNX models with an LSTM layer will have a validation error related to input shape or will show a significant drop in accuracy
- Characteristic: LSTM models that have initial h/c input tensors will generally fail due to this issue
- Workaround: Provide the command-line argument "--input_layout NONTRIVIAL" for each initial h/c input tensor of every LSTM Op (see the example command below)
- Error: AssertionError: LSTM h/c input buffer needs to have format NONTRIVIAL, got NFC
- Characteristic: Failure seen with bidirectional LSTM layers
- Workaround: Provide the command-line argument "--input_layout NONTRIVIAL" for each initial h/c input tensor of every LSTM Op