Hi Qualcomm Team,
Using QNN 2.8, my model in ONNX format can be converted and run on the x86 backend successfully.
However, when I converted the same ONNX file to FP16 and ran it on the HTP emulator, I found that every other element of my input data was corrupted. Below is how I convert, build, and run on the emulator:
!qnn-onnx-converter --input_network simple_linear_regression.onnx --float_bw 16 --output_path simple_linear_regression_fp16.cpp --input_dim "input" 1,1,2
!qnn-model-lib-generator -c simple_linear_regression_fp16.cpp -b simple_linear_regression_fp16.bin -t x86_64-linux-clang
!qnn-net-run --model libs/x86_64-linux-clang/libsimple_linear_regression_fp16.so \
Then I read back the input data from emulator's output folder:
x_raw = np.fromfile("simple_linear_regression_onnx_fp16/Result_0/input_ncf.raw", dtype=np.float16)
display(x_raw)
And I got: array([-2. , -1.006], dtype=float16)
But the original input data I provided is array([[[-1.592, -1.006]]], dtype=float16)
The 1st element (FP16 format) is corrupted to -2.
Note: this is a simplified model, which is a simple linear regression: y=2*x0-3.4*x1+4.2.
In my original complex model, the input shape is [1, 80, 201], and I can observe that every other element in each row is corrupted to -2.
Is this a bug in the HTP emulator? Thanks!
Dear developer,
QNN is available to specific Qualcomm customers. Could you please clarify the following for us?
1. What platform is this issue on: Auto or another?
2. Which company is this issue being reported from?
BR.
Wei
Thanks Wei! It doesn't matter now: I figured out that by default qnn-net-run assumes the input data is in float32, even when the converted model is float16.
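For anyone hitting the same thing, here is a minimal sketch of how the input file could be prepared, assuming (as reported above) that qnn-net-run parses input .raw files as float32 by default. The file name input_fp32.raw and the temporary directory are hypothetical, chosen only for illustration; the values are the ones from the original post.

```python
import os
import tempfile

import numpy as np

# Input from the original post, held as float16 in the notebook.
x_fp16 = np.array([[[-1.592, -1.006]]], dtype=np.float16)

# Assumption (from the thread): qnn-net-run reads input .raw files as
# float32 by default, so the data is widened before being written out.
x_fp32 = x_fp16.astype(np.float32)

# Hypothetical output location, used only for this sketch.
out_dir = tempfile.mkdtemp()
raw_path = os.path.join(out_dir, "input_fp32.raw")
x_fp32.tofile(raw_path)

# Sanity checks: 2 elements * 4 bytes each, and a lossless round trip.
size = os.path.getsize(raw_path)
readback = np.fromfile(raw_path, dtype=np.float32)
print(size, readback)
```

The path written here would then be the entry in the input list passed to qnn-net-run in place of the float16 file.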