snpe-net-run on DSP with 16-bit activations gives bad results
Posted: Mon, 2023-02-13 01:23

Hi all.

I trained a model with PyTorch, converted it to ONNX, and from there to DLC.

I quantized the DLC using the snpe-dlc-quantize tool with 16-bit activations and 8-bit weights.

Afterwards I used snpe-net-run to run the model. With the --use_dsp flag the model gives bad (random) results, while removing the flag yields normal results. I'm not using any flags other than the input/output parameters and --use_dsp.

How can we run a model with 16 bit activations and 8 bit weights on DSP correctly?

I'm using SNPE v2.5.0.4052.

Posted: Sat, 2023-02-18 06:46

Dear developer,

You can take a look at the command below, which quantizes the activations to 16 bits.

      snpe-dlc-quantize --input_dlc ${MODEL}.dlc \
            --input_list ${INPUT_LIST} \
            --output_dlc ${MODEL}_quant.dlc \
            --enable_htp \
            --act_bitwidth 16




Posted: Mon, 2023-02-27 05:39

Thanks for your comment. I still get large differences when running the exact same model on DSP vs. CPU.

Using the command you shared makes no difference; I still get large differences between the two runs.

Is there some memory limitation on the DSP? Is there a way I can debug this?



Posted: Tue, 2023-02-28 07:56

To be more specific, here is how to fully reproduce the issue:

I took a model from:

Specifically this model:

To convert the tflite model I used the following command:

snpe-tflite-to-dlc --input_network face_landmark.tflite --output_path face_landmark.dlc --input_dim input_1 "1,192,192,3" --out_node conv2d_21

(BTW, the output shape is not the same as in the original model, but let's leave that aside; it doesn't affect the problem I'm describing.)


Next, I quantized the model using the following command:

snpe-dlc-quantize --input_dlc face_landmark.dlc --output_dlc face_landmark_quant.dlc --weights_bitwidth=8 --act_bitwidth=16 --input_list /home/almog/workspace/docker_share/sample_data_for_fld_raw/image_list_short_full_path.txt --use_enhanced_quantizer --enable_htp --override_params

And ran it on the device using the following two commands:

/data/local/tmp/snpeexample/snpe-net-run  --container $qat_model_loc --input_list /data/local/tmp/sample_data_for_fld_raw/image_list_short_full_path.txt --output_dir /data/local/tmp/precision_test/qat_dsp --use_dsp

/data/local/tmp/snpeexample/snpe-net-run  --container $qat_model_loc --input_list /data/local/tmp/sample_data_for_fld_raw/image_list_short_full_path.txt --output_dir /data/local/tmp/precision_test/qat_dsp --use_dsp


Then I compared the outputs of the two runs, which are significantly different!
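
For anyone who wants to put numbers on "significantly different", here is a minimal sketch of one way to compare the two runs. It is not the exact script I used. It assumes both output directories have been pulled from the device to the host, that each contains the same relative layout of .raw files, and that every .raw file is a headerless little-endian float32 dump. The function names are made up for illustration.

```python
import math
import sys
from array import array
from pathlib import Path


def load_raw_float32(path):
    """Read a headerless float32 file into a list of Python floats.

    Assumption: the file is a flat little-endian float32 dump.
    """
    values = array("f")
    values.frombytes(Path(path).read_bytes())
    if sys.byteorder == "big":
        values.byteswap()  # file is assumed little-endian
    return list(values)


def compare(a, b):
    """Return (max absolute difference, cosine similarity) of two vectors."""
    if len(a) != len(b):
        raise ValueError(f"size mismatch: {len(a)} vs {len(b)}")
    max_abs = max((abs(x - y) for x, y in zip(a, b)), default=0.0)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cosine = dot / (norm_a * norm_b) if norm_a and norm_b else float("nan")
    return max_abs, cosine


def compare_runs(dir_a, dir_b):
    """Compare every .raw file under dir_a with the same relative path in dir_b.

    Files that exist only in dir_a are skipped.
    """
    results = {}
    for file_a in sorted(Path(dir_a).rglob("*.raw")):
        rel = file_a.relative_to(dir_a)
        file_b = Path(dir_b) / rel
        if file_b.exists():
            results[str(rel)] = compare(load_raw_float32(file_a),
                                        load_raw_float32(file_b))
    return results
```

Calling compare_runs on the CPU output directory and the DSP output directory returns a dict keyed by relative file path. A cosine similarity close to 1 with a small maximum difference would point to ordinary quantization noise, while values far from 1 suggest the DSP output is genuinely wrong.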


What is causing this? Am I doing something wrong?

