Hi All,
I've recently converted an ONNX model to a DLC with SNPE and run inference. I noticed the model runs much slower on the GPU or DSP runtime than on the CPU. I understand this can happen when certain layers are not supported, but I'm looking for more information on optimizing this.
(1) Is there an updated list of supported layers? I came across the table in the reference guide, but it had a note saying the list is outdated.
(2) How can I characterize this performance gap? I'm looking for ways to view the graph and see the points at which execution switches between runtimes (i.e., fallbacks).
(3) This might be related to (2), but I have the same question for quantization. Does the quantizer restructure the graph based on which layers support quantized inputs (inserting quantize/dequantize layers)?
I'm open to suggestions from the community on ways to dissect this further.
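For context, this is the kind of per-layer timing breakdown I'm trying to get. A sketch of what I have in mind, based on the SNPE command-line tools (model and input paths are placeholders, and exact flag names may differ between SNPE versions):

```shell
# Run inference on the DSP runtime with detailed profiling enabled
# (swap --use_dsp for --use_gpu or omit it for CPU to compare runtimes).
snpe-net-run --container model.dlc \
             --input_list input_list.txt \
             --use_dsp \
             --profiling_level detailed

# The run produces a diagnostic log in the output directory;
# snpe-diagview prints per-layer execution times from it.
snpe-diagview --input_log output/SNPEDiag_0.log
```

Comparing the per-layer timings between runtimes should show which layers dominate the gap, but I'd still like a way to see where the runtime fallbacks themselves happen.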
Hi,
Which SNPE version did you use?
Can you check with the latest SNPE version?
Thanks
I'm using 2.17.0.231124. Do you know where I can find the updated list for that version?
You can check in Qualcomm Package Manager; see the link below. You can download the SDK and view the list of versions there.
https://qpm.qualcomm.com/#/main/tools/details/qualcomm_neural_processing...
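You can also inspect the converted DLC directly to see which layers it contains. A sketch using the snpe-dlc-info tool that ships with the SDK (the model path is a placeholder, and the output format varies by version):

```shell
# Print the layer table of the DLC: layer names, types, input/output
# dimensions, and parameter counts. Comparing the layer types here
# against the supported-ops table for your target runtime shows which
# layers may be falling back to CPU.
snpe-dlc-info -i model.dlc
```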