Forums - snpe-net-run fails on VOXL2 Adreno GPU, but works on CPU

dario.pisanti
Join Date: 1 Nov 23
Posts: 1
Posted: Thu, 2023-11-09 09:37

Hi,
I hope you can help me with the following issue.

SUMMARY:
I am trying to run a VGG model on a ModalAI VOXL2, following the tutorial at https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/tutori...

Everything works fine when I run inference with the vgg16.dlc model (step 7 of the tutorial) on the VOXL2 CPU by running:

cd $SNPE_ROOT/examples/Models/VGG/data/cropped
snpe-net-run --input_list raw_list.txt --container ../../dlc/vgg16.dlc --output_dir ../../output

with the expected output:

-------------------------------------------------------------------------------
Model String: N/A
SNPE v2.15.4.231013125348_62905
-------------------------------------------------------------------------------
Processing DNN input(s):
/opt/qcom/aistack/snpe/2.15.4.231013/examples/Models/VGG/data/cropped/kitten.raw
Successfully executed!

However, when I enable GPU usage by running:

snpe-net-run --input_list raw_list.txt --container ../../dlc/vgg16.dlc --output_dir ../../output --use_gpu

I get the following error:

error_code=201; error_message=Casting of tensor failed. error_code=201; error_message=Casting of tensor failed. Failed to create input tensor: vgg0_dense0_weight_permute for Op: vgg0_dense0_fwd error: 1002; error_component=Dl System; line_no=817; thread_id=547788872288; error_component=Dl System; line_no=277; thread_id=547865747472

HOW TO REPRODUCE:
I successfully set up the Qualcomm Neural Processing SDK on the VOXL2 using the binaries in $SNPE_ROOT/bin/aarch64-ubuntu-gcc7.5, and I modified $SNPE_ROOT/bin/envsetup.sh accordingly so that the environment variables are set correctly.
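For reference, the environment changes amount to roughly the following sketch (the install path matches my SNPE 2.15.4 location; the exact contents of envsetup.sh may differ):

```shell
# Sketch of the environment setup on the VOXL2, assuming SNPE 2.15.4
# installed under /opt/qcom/aistack; adjust SNPE_ROOT to your install.
export SNPE_ROOT=/opt/qcom/aistack/snpe/2.15.4.231013
export PATH="$SNPE_ROOT/bin/aarch64-ubuntu-gcc7.5:$PATH"
export LD_LIBRARY_PATH="$SNPE_ROOT/lib/aarch64-ubuntu-gcc7.5:$LD_LIBRARY_PATH"

# Quick sanity check that the aarch64 tools are now on PATH
echo "$PATH" | grep -q aarch64-ubuntu-gcc7.5 && echo "PATH ok"
```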

I followed the instructions from step 1 to step 4 of the VGG tutorial at https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-2/tutori... on the VOXL2.

I converted the VGG ONNX model into Qualcomm SDK DLC format (step 5) on a host machine running Ubuntu 20.04 with Clang 9 installed, where I set up the Qualcomm Neural Processing SDK using the binaries in $SNPE_ROOT/bin/x86_64-linux-clang (the conversion operation is not supported on the VOXL2 architecture).
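For completeness, the conversion step on the host looks roughly like this. The vgg16.onnx filename is an assumption based on the tutorial, and the snippet guards on the converter actually being on PATH:

```shell
# Hypothetical sketch of step 5 on the x86 host: convert ONNX to DLC with
# snpe-onnx-to-dlc from the x86_64-linux-clang tools (source envsetup.sh
# first). vgg16.onnx / vgg16.dlc names are assumptions from the tutorial.
if command -v snpe-onnx-to-dlc >/dev/null 2>&1; then
  snpe-onnx-to-dlc --input_network vgg16.onnx --output_path vgg16.dlc
else
  echo "snpe-onnx-to-dlc not on PATH; source \$SNPE_ROOT/bin/envsetup.sh"
fi
```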

I pushed the converted VGG model in DLC format to the VOXL2 and followed the remaining instructions of the tutorial up to step 7, where I hit the error reported in the summary above.
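The push step was just an adb transfer along these lines (the on-device destination is the tutorial's examples/Models/VGG/dlc directory, inferred from the paths above; adjust to your layout):

```shell
# Hypothetical sketch: copy the converted model from the host to the VOXL2
# over adb. Destination matches the tutorial's examples/Models/VGG/dlc tree
# on my device; adjust if yours differs.
if command -v adb >/dev/null 2>&1; then
  adb push vgg16.dlc /opt/qcom/aistack/snpe/2.15.4.231013/examples/Models/VGG/dlc/
else
  echo "adb not found on host"
fi
```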

MODALAI VOXL2 SPECS:
Architecture: aarch64
OS: Ubuntu 18.04
CPU: Qualcomm® QRB5165: 8 cores up to 3.091 GHz, 8 GB LPDDR5
GPU: Adreno 650 GPU - 1024 ALUs
NPU: 15 TOPS AI embedded Neural Processing Unit
ONNX PYTHON PACKAGES: onnx==1.14.1, onnxruntime==1.16.1

HOST SPECS:
Architecture: x86_64
OS: Ubuntu 20.04
CPU: Intel(R) Xeon(R) W-2125, 8 cores @ 4.00 GHz
GPU: NVIDIA Quadro P2000 (GP106GL)
ONNX PYTHON PACKAGES: onnx==1.14.1, onnxruntime==1.16.1

FURTHER DETAILS:
I checked the availability of the GPU runtime on the VOXL2 by running snpe-platform-validator from my host machine:
cd /opt/qcom/aistack/snpe/2.15.4.231013/bin/x86_64-linux-clang 
python3 snpe-platform-validator-py --runtime="all" --directory=/opt/qcom/aistack/snpe/2.15.4.231013 --buildVariant="aarch64-ubuntu-gcc7.5"

The platform validator results for GPU are:

Runtime supported: Supported
Library Prerequisites: Found
Library Version: Not Queried
Runtime Core Version: Not Queried
Unit Test: Passed
Overall Result: Passed

 
