Running 16-bit networks on DSP
Mon, 2023-04-24 03:22

Hi, I used AIMET to train a 16-bit model (16-bit activations + 8-bit weights) with per-axis quantization. When I run the model on the DSP, the results are very bad. As input I create a raw file that contains the image in float32 format. When I run the exact same model on the CPU, I get good results.

So my question is: how should I run a 16-bit model on the DSP?

I'm using a Snapdragon 888 with SNPE 2.5.*

The command I used for generating the DLC:

snpe-onnx-to-dlc --input_network fld_20230417_183923_quant.onnx --output fld_20230417_183923_quant.dlc --quantization_overrides fld_20230417_183923_quant.encodings

Quantize the model:

snpe-onnx-to-dlc --input_network fld_20230417_183923_quant.onnx --output fld_20230417_183923_quant.dlc --quantization_overrides fld_20230417_183923_quant.encodings


And for running the model:

snpe-net-run --verbose --container $dlc_loc_device --input_list $image_list_short_path --output_dir $device_work_dir --set_unconsumed_as_output --use_dsp
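
For reference, a minimal sketch of how a float32 raw input and the matching input list can be prepared. The file names, the 4x4x3 shape, the HWC layout, and the [0, 1] scaling are all placeholders assumed for illustration; the real tensor has to match the model's input shape, layout, and the preprocessing used in training.

```python
import struct
from pathlib import Path


def write_float32_raw(pixels, raw_path):
    """Write a flat sequence of numbers as little-endian float32, with no header."""
    data = struct.pack("<%df" % len(pixels), *pixels)
    Path(raw_path).write_bytes(data)
    return len(data)


def write_input_list(raw_paths, list_path):
    """Write one raw-file path per line (single-input model assumed)."""
    Path(list_path).write_text("\n".join(str(p) for p in raw_paths) + "\n")


# Hypothetical 4x4 RGB image with 8-bit values, flattened in HWC order.
height, width, channels = 4, 4, 3
image_u8 = [(i * 5) % 256 for i in range(height * width * channels)]

# Assumed preprocessing: scale to [0, 1]. Use whatever the model was trained with.
image_f32 = [v / 255.0 for v in image_u8]

n_bytes = write_float32_raw(image_f32, "input_0.raw")
write_input_list(["input_0.raw"], "image_list.txt")
```

The raw file should be exactly height * width * channels * 4 bytes; a size mismatch is a quick sign that the input does not match what the model expects.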



Sat, 2023-04-29 08:47

Dear developer,

Good to see that AIMET has been integrated into your project.

You can raise this issue in the AIMET GitHub repo for more help.

GitHub - quic/aimet: AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.



Thu, 2023-05-11 08:49

Hi SNPE team

This is Abhi from the AIMET team. I reviewed the script that @almogdavid used for applying AIMET AutoQuant and then QAT. His approach seems correct.

I am guessing that there are missing arguments when invoking SNPE:

- Should we not specify 8 bits for weights and 16 bits for activations?

- Is there a flag to specify 32-bit bias? We should set that.

- Any others?
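
To illustrate why the bit-widths matter, here is a minimal sketch of generic affine (scale and offset) quantization. It is not SNPE's or AIMET's actual implementation, and the activation range and sample values are made up. It round-trips the same values at 8 and 16 bits and compares the worst-case error.

```python
def quantize_dequantize(x, x_min, x_max, bitwidth):
    """Round-trip x through an unsigned affine quantizer with the given bit-width."""
    levels = (1 << bitwidth) - 1          # 255 for 8 bits, 65535 for 16 bits
    scale = (x_max - x_min) / levels      # real-valued size of one integer step
    q = round((x - x_min) / scale)        # nearest integer level
    q = max(0, min(levels, q))            # clamp to the representable range
    return q * scale + x_min


def max_abs_error(values, x_min, x_max, bitwidth):
    """Largest round-trip error over a list of values."""
    return max(abs(v - quantize_dequantize(v, x_min, x_max, bitwidth)) for v in values)


# Hypothetical activation range and sample values.
act_min, act_max = -6.0, 6.0
activations = [-5.9, -1.234, 0.001, 0.5, 3.14159, 5.999]

err_8 = max_abs_error(activations, act_min, act_max, 8)
err_16 = max_abs_error(activations, act_min, act_max, 16)
print(f"max error at 8 bits:  {err_8:.6f}")
print(f"max error at 16 bits: {err_16:.6f}")
```

The step size shrinks by a factor of roughly 256 going from 8 to 16 bits, so activations whose encodings were tuned for 16 bits would lose that much precision if they ended up executed at 8 bits. That is one possible reason to state the bit-widths explicitly.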


