Forums - SNPE net run on DSP with 16 bit activation give bad results

almogdavid
Join Date: 4 Feb 20
Posts: 4
Posted: Mon, 2023-02-13 01:23

Hi all.

I trained a model in PyTorch, converted it to ONNX, and from there to DLC.

I quantized the DLC using the snpe-dlc-quantize tool with 16-bit activations and 8-bit weights.

Afterwards I used snpe-net-run to run the model. With the --use_dsp flag the model gives bad (random) results, while removing the flag yields normal results. I'm not using any flags other than the input/output parameters and --use_dsp.

How can we run a model with 16-bit activations and 8-bit weights on the DSP correctly?

I'm using SNPE v2.5.0.4052.
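
For context on why 16-bit activations would normally be expected to lose less precision than 8-bit ones, here is a minimal sketch of a quantize/dequantize round trip. It is a generic min/max affine quantizer written for illustration only; the function names are hypothetical and it is not claimed to match what snpe-dlc-quantize or the DSP runtime actually do.

```python
def fake_quantize(values, bitwidth):
    """Quantize floats to `bitwidth`-bit unsigned integers, then map back to floats."""
    # Assumption: the encoding range is the observed min/max, widened to include zero.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    levels = (1 << bitwidth) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    return [lo + round((v - lo) / scale) * scale for v in values]


def max_round_trip_error(values, bitwidth):
    """Largest absolute difference between a value and its quantized version."""
    quantized = fake_quantize(values, bitwidth)
    return max(abs(v - q) for v, q in zip(values, quantized))


# Made-up activation values, roughly in [-0.5, 0.5].
activations = [i / 997.0 for i in range(-500, 501)]
err8 = max_round_trip_error(activations, 8)
err16 = max_round_trip_error(activations, 16)
print("max error,  8-bit:", err8)
print("max error, 16-bit:", err16)
```

Under these assumptions the 16-bit error is far smaller than the 8-bit error, so output that looks random is unlikely to come from the activation resolution alone.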

weihuan
Join Date: 12 Apr 20
Posts: 270
Posted: Sat, 2023-02-18 06:46

Dear developer,

You can take a look at the command below, which quantizes with 16-bit activations.

      snpe-dlc-quantize --input_dlc ${MODEL}.dlc \
            --input_list ${INPUT_LIST} \
            --output_dlc ${MODEL}_quant.dlc \
            --enable_htp \
            --act_bitwidth 16 \
            --override_params

 

BR.

Wei

almogdavid
Join Date: 4 Feb 20
Posts: 4
Posted: Mon, 2023-02-27 05:39

Thanks for your comment. I still get large differences when running the exact same model on DSP vs. CPU.

Using the command you shared makes no difference; the outputs of the two runs still differ widely.

Is there some memory limitation on the DSP? Is there a way I can debug this?

Thanks,

David.

almogdavid
Join Date: 4 Feb 20
Posts: 4
Posted: Tue, 2023-02-28 07:56

To make this more specific, here is how to fully reproduce the issue:

I took a model from: https://google.github.io/mediapipe/solutions/face_mesh.html

Specifically this model: https://storage.googleapis.com/mediapipe-assets/face_landmark.tflite

To convert the tflite model I used the following command:

snpe-tflite-to-dlc --input_network face_landmark.tflite --output_path face_landmark.dlc --input_dim input_1 "1,192,192,3" --out_node conv2d_21

(BTW, the output shape is not the same as in the original model, but let's leave that aside; it does not affect the problem I'm describing.)

 

Next, I quantized the model using the following command:

snpe-dlc-quantize --input_dlc face_landmark.dlc --output_dlc face_landmark_quant.dlc --weights_bitwidth=8 --act_bitwidth=16 --input_list /home/almog/workspace/docker_share/sample_data_for_fld_raw/image_list_short_full_path.txt --use_enhanced_quantizer --enable_htp --override_params

And ran it on the device using the following 2 commands:

/data/local/tmp/snpeexample/snpe-net-run  --container $qat_model_loc --input_list /data/local/tmp/sample_data_for_fld_raw/image_list_short_full_path.txt --output_dir /data/local/tmp/precision_test/qat_dsp --use_dsp

/data/local/tmp/snpeexample/snpe-net-run  --container $qat_model_loc --input_list /data/local/tmp/sample_data_for_fld_raw/image_list_short_full_path.txt --output_dir /data/local/tmp/precision_test/qat_dsp --use_dsp

 

Then I compared the outputs of the 2 runs, which are significantly different!
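
A comparison of this kind can be sketched as follows. This is a minimal example that assumes each output file is a headerless dump of float32 values in the host's byte order; the function names and the example paths in the trailing comment are hypothetical and need to be adapted to the files actually pulled from the device.

```python
import math
from array import array


def read_raw_float32(path):
    """Read a headerless file of native-endian float32 values."""
    values = array("f")
    with open(path, "rb") as f:
        values.frombytes(f.read())
    return values


def compare_outputs(path_a, path_b):
    """Return (max absolute difference, cosine similarity) of two raw outputs."""
    a = read_raw_float32(path_a)
    b = read_raw_float32(path_b)
    if len(a) != len(b):
        raise ValueError(f"size mismatch: {len(a)} vs {len(b)} values")
    max_abs_diff = max((abs(x - y) for x, y in zip(a, b)), default=0.0)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Cosine similarity is undefined if either vector is all zeros.
    cosine = dot / (norm_a * norm_b) if norm_a and norm_b else float("nan")
    return max_abs_diff, cosine


# Example call with hypothetical paths (one file per run, same input sample):
# diff, cos = compare_outputs("cpu_out/Result_0/output.raw",
#                             "dsp_out/Result_0/output.raw")
# print(f"max abs diff = {diff:.6f}, cosine similarity = {cos:.6f}")
```

A cosine similarity close to 1 with a small maximum difference would point to ordinary quantization noise, while a value near 0 suggests the two runs produce unrelated outputs.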

 

What is the cause for it? Am I doing something wrong?
