Hi all,
I found that if I run an un-quantized (float32) model using the DSP or GPU_FP16 runtime, it works without any issue.
So I'd like to ask: does SNPE perform an fp16 conversion before the input layer?
Thanks in advance.
Besides, I read in the docs that SNPE doesn't support quantizing UDL layers, but that the DSP runtime can automatically quantize the model during SNPE initialization.
Also, I benchmarked a quantized and a non-quantized AlexNet on the DSP runtime, and they showed the same inference time.
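For context, the comparison above can be reproduced along these lines with the SNPE SDK's snpe-net-run tool (a sketch only: the .dlc and input-list file names are placeholders, and the available flags may vary between SDK versions):

```shell
# Run the float32 (non-quantized) model on the DSP runtime;
# per the docs, the DSP runtime quantizes the float model during initialization.
snpe-net-run --container alexnet_float.dlc --input_list input_list.txt --use_dsp

# Run the pre-quantized model on the same runtime for comparison.
snpe-net-run --container alexnet_quantized.dlc --input_list input_list.txt --use_dsp
```

Timing each invocation (or inspecting the generated diagnostic logs) is how the matching inference times above were observed.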