Dear Qualcomm Team,
Architecture: SM7450
QNN version: 2.12.0.230626
I have a simple Conv2D TFLite model whose architecture is Input -> Conv2D -> Output. The input and output have the same shape, (299, 299, 3). The Conv2D layer has no bias, and I set all of the kernel weights to 10.
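For reference, I build the model roughly like this (a minimal sketch in TensorFlow/Keras; the exact script and file names are illustrative):

import tensorflow as tf

# Input -> Conv2D -> Output with 'same' padding so input and output
# shapes match; 3 filters, no bias, every kernel weight set to 10.
inp = tf.keras.Input(shape=(299, 299, 3))
out = tf.keras.layers.Conv2D(
    filters=3, kernel_size=3, padding="same", use_bias=False,
    kernel_initializer=tf.keras.initializers.Constant(10.0))(inp)
model = tf.keras.Model(inp, out)

# Export to TFLite.
with open("test.tflite", "wb") as f:
    f.write(tf.lite.TFLiteConverter.from_keras_model(model).convert())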
To check the precision, I created dummy data: a 299x299x3 tensor filled with 1.0.
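The dummy data is written out as a raw float32 file, roughly (the file name is illustrative):

import numpy as np

# 1x299x299x3 tensor of ones, saved as raw float32 bytes.
np.ones((1, 299, 299, 3), dtype=np.float32).tofile("input.raw")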
Command to generate FP32 model:
${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-tflite-converter --input_network test.tflite --input_dim "serving_default_input_1:0" 1,299,299,3 --output_path test.cpp
Command to generate FP16 model:
${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-tflite-converter --input_network test.tflite --input_dim "serving_default_input_1:0" 1,299,299,3 --float_bw 16 --output_path test.cpp
Then I generated .so files for the FP16 and FP32 models respectively.
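For completeness, the library generation step is roughly (output directory is illustrative):
${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator -c test.cpp -b test.bin -t aarch64-android -o model_libs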
I pushed the .so files and dummy data to the SM7450 phone.
Command to run inference on the phone:
./qnn-net-run --backend <libQnnGpu.so or libQnnHtp.so> --model <fp32.so or fp16.so> --input_list target_raw_list.txt
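Here target_raw_list.txt lists the raw input for each inference pass; if I understand the input-list format correctly, a single line such as the following should work:
serving_default_input_1:0:=input.raw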
With the FP32 model on both GPU and HTP, I get a result of 270, which is correct: each output point of the convolution is kernel value * kernel height * kernel width * input channels * input value = 10 * 3 * 3 * 3 * 1.0 = 270.
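As a quick sanity check of that number:

import numpy as np

# One output point: a 3x3x3 all-tens kernel applied to a 3x3x3 patch of ones.
print(np.sum(np.full((3, 3, 3), 10.0) * np.ones((3, 3, 3))))  # 270.0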
However, the FP16 model's result is close to zero, around 1e-27.
May I know whether the above commands are the correct way to generate the FP16 model and run inference? How can I solve this precision problem?
Thanks.
Dear Chan, dear Qualcomm Team,
Just wanted to highlight again that we are running into the same problem when using Float16 on the HTP; see my post from a few days ago.
Any help here would be highly appreciated!
Best regards,
Manuel
Hi Manuel,
Are you using the same commands to convert the model to FP16 precision and run inference?
Regards,
Yi Xuan.
Hi Yi Xuan,
Yes, I use the same kind of commands. My model is named differently, of course, but otherwise everything is the same.