I have created an ONNX model with 7 layers: 2 convolution layers, 2 Sigmoid layers, 2 max-pool layers, and a Softmax layer. I converted this model with qnn-onnx-converter, which produced the .cpp and .bin files, and then used qnn-model-lib-generator to build the model shared library (lib*.so).
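For reference, here is a minimal sketch of how such a network could be defined and exported (assuming a PyTorch definition; the channel counts, input size, and class count are illustrative placeholders, not the actual model):

```python
# Hypothetical reconstruction of the described network; layer sizes are
# placeholders. The comments map each module to the op names that show
# up in the QNN profile below.
import torch
import torch.nn as nn

class SigmoidNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # Conv_0
            nn.Sigmoid(),                                  # Sigmoid_1
            nn.MaxPool2d(2),                               # MaxPool_2
            nn.Conv2d(16, 32, kernel_size=3, padding=1),   # Conv_3
            nn.Sigmoid(),                                  # Sigmoid_4
            nn.MaxPool2d(2),                               # MaxPool_5
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # Gemm_7

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return torch.softmax(x, dim=1)                     # Softmax_8

model = SigmoidNet().eval()
dummy = torch.randn(1, 3, 32, 32)  # assumed input shape
torch.onnx.export(model, dummy, "model_sigmoid.onnx", opset_version=11)
```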
After successfully generating the model shared library, I used the qnn_bench.py script from the QNN SDK to run detailed layer-wise profiling on a QRB5165 target with the DSP runtime. The command used for this profiling is:
python3 qnn_bench.py -c model_sigmoid.json -t aarch64-ubuntu-gcc7.5 --dsp_type v66 -l detailed -v ADA7D246
Here are the layer-wise computational times and total inference time from the CSV file for the model with Sigmoid activation layers:
Inference time with Sigmoid avg(us)
Total Inference Time [Conv_0:OpId_0(us)] 392
Total Inference Time [Sigmoid_1:OpId_1(us)] 300
Total Inference Time [MaxPool_2:OpId_2(us)] 234
Total Inference Time [Conv_3:OpId_3(us)] 188
Total Inference Time [Sigmoid_4:OpId_4(us)] 84
Total Inference Time [MaxPool_5:OpId_5(us)] 40
Total Inference Time [_12_nchw:OpId_6(us)] 338
Total Inference Time [Gemm_7:OpId_7(us)] 795
Total Inference Time [Softmax_8:OpId_8(us)] 26
Total Inference Time [NetRun] 5461
Here are the layer-wise computational times and total inference time achieved in the CSV file for the model with ReLU activation layers:
Inference time with ReLU avg(us)
Total Inference Time [Conv_0:OpId_0(us)] 328
Total Inference Time [MaxPool_2:OpId_1(us)] 213
Total Inference Time [Conv_3:OpId_2(us)] 188
Total Inference Time [MaxPool_5:OpId_3(us)] 68
Total Inference Time [_12_nchw:OpId_4(us)] 340
Total Inference Time [Gemm_7:OpId_5(us)] 801
Total Inference Time [Softmax_8:OpId_6(us)] 31
Total Inference Time [NetRun] 4914
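For a quick side-by-side comparison, the two CSVs can be diffed with a small script like this (a sketch that assumes the benchmark CSV rows look like the lines pasted above, i.e. a `Total Inference Time [...]` label in the first column and the average time in microseconds in the second; the file names are placeholders):

```python
# Hypothetical helper to diff two qnn_bench layer profiles. The CSV layout
# is assumed from the rows pasted above and may need adjusting.
import csv
import re

# Matches e.g. "Total Inference Time [Conv_0:OpId_0(us)]" and "[NetRun]".
ROW = re.compile(r"Total Inference Time \[(?P<op>[^:\]]+)(?::OpId_\d+\(us\))?\]")

def load_profile(path):
    """Map layer label -> average time in microseconds."""
    times = {}
    with open(path, newline="") as f:
        for row in csv.reader(f):
            m = ROW.search(row[0]) if row else None
            if m:
                times[m.group("op")] = float(row[1])
    return times

sig = load_profile("profile_sigmoid.csv")   # placeholder file names
relu = load_profile("profile_relu.csv")

for op in sorted(set(sig) | set(relu)):
    print(f"{op:12s} sigmoid={sig.get(op, 0):6.0f} us  relu={relu.get(op, 0):6.0f} us")
```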
I observed that the model with Sigmoid layers is noticeably slower than the model with ReLU layers (5461 us vs. 4914 us total, a difference of 547 us; the two Sigmoid ops alone account for 384 us).
I would like to understand why the Sigmoids are so expensive and whether there is anything we can do about it. We could of course stick to networks that use only ReLU activations, but that might hurt our resulting image quality.
Thanks,
Lavanya Varikuppala
Dear Lavanya,
ReLU generally outperforms Sigmoid for the reasons below:
1. ReLU layers are computationally trivial: the forward and backward passes are simple comparisons ("if" statements), whereas Sigmoid is expensive because evaluating 1 / (1 + e^(-x)) requires an exponential and a division for every element (see the micro-benchmark sketch below).
2. ReLU networks also tend to converge more quickly than Sigmoid networks, although that helps training time rather than the per-inference latency you are measuring here.
Note also that no ReLU ops appear in your second profile at all, which suggests the DSP backend fused the ReLUs into the preceding convolutions, making their runtime cost effectively free.
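As a quick host-side illustration of point 1 (a NumPy micro-benchmark sketch, not the QNN DSP runtime; absolute numbers will differ on the Hexagon DSP, but the relative gap is the point):

```python
# Micro-benchmark: ReLU is a single compare/select per element, while
# Sigmoid needs an exponential and a division. Host CPU only; it just
# illustrates the per-element cost gap, not actual DSP timings.
import time
import numpy as np

x = np.random.randn(1 << 22).astype(np.float32)  # ~4M activations

def bench(fn, reps=20):
    fn(x)  # warm-up
    t0 = time.perf_counter()
    for _ in range(reps):
        fn(x)
    return (time.perf_counter() - t0) / reps * 1e6  # average microseconds

relu_us = bench(lambda v: np.maximum(v, 0.0))           # max(0, x)
sigmoid_us = bench(lambda v: 1.0 / (1.0 + np.exp(-v)))  # 1 / (1 + e^-x)

print(f"ReLU:    {relu_us:8.0f} us")
print(f"Sigmoid: {sigmoid_us:8.0f} us")
```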
Thanks,
Rahul
Thanks Rahul.
Please let me know if there is anything we can do with Sigmoid to get performance comparable to, or better than, ReLU.
Regards,
Lavanya Varikuppala
Which toolchain target did you use in the `qnn-model-lib-generator` command?