Dear QNN team,
when compiling and running the Sample App or the compiled qnn-net-run on QNN v2.8, the process hangs on the HTP backend. Running on GPU works for both. Could you help us find the issue?
The output of qnn-net-run for GPU is simply:
````
./qnn-net-run --backend libQnnGpu.so --input_list inputs.txt --model libnet_g.so
qnn-net-run pid:17986
````
When running on HTP it is as below and then the process gets stuck:
````
./qnn-net-run --backend libQnnHtp.so --input_list inputs.txt --model libnet_g.so
qnn-net-run pid:18154
````
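One way to tell a slow step from a genuine hang is to bound the run with a time limit. The following is a minimal sketch, assuming the `timeout` utility is available in the shell being used; `run_bounded` is a hypothetical helper name and the limit is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: run a command with a time limit (in seconds).
# Assumes `timeout` is available, which exits with status 124 when the
# limit is reached.
run_bounded() {
    limit=$1
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        # Report the timeout on stderr so normal output stays clean.
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}
```

For example, `run_bounded 600 ./qnn-net-run --backend libQnnHtp.so --input_list inputs.txt --model libnet_g.so` would give the HTP run ten minutes before reporting a timeout.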
For HTP, I suggest using an offline cache. If the model is very large, compiling it on the device takes a long time, which is why the process appears to be "stuck".
To run with an offline cache:
1) Generate the cache on the host first:
qnn-context-binary-generator --model libnet_g.so xxxx --output test.cache
2) On the device:
qnn-net-run xxxx --retrieve_context test.cache
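The two steps can be sketched as a small script. This is only an illustration: the flag names are the ones quoted in this thread, the file names are placeholders, the location of the generated file is an assumption, and `run`, `host_step`, and `device_step` are hypothetical helper names. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch of the two-step offline-cache flow (placeholders throughout).
MODEL=libnet_g.so
BACKEND=libQnnHtp.so
CACHE_NAME=libnet_g_cache             # placeholder name for the context binary
CACHE_FILE=output/$CACHE_NAME.bin     # assumption: path of the generated file

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1 (host): build the context binary ahead of time.
host_step() {
    run qnn-context-binary-generator --model "$MODEL" --backend "$BACKEND" --binary_file "$CACHE_NAME"
}

# Step 2 (device): load the prebuilt context instead of compiling on device.
device_step() {
    run ./qnn-net-run --backend "$BACKEND" --input_list inputs.txt --retrieve_context "$CACHE_FILE"
}

host_step
device_step
```

Copying the generated file to the device between the two steps is left out, since the transfer method depends on the setup.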
Hello and thank you for your response! Unfortunately, running from the cache still hangs. We ran the following commands from adb shell:
cd /data/local/tmp/QNN-2.8
export LD_LIBRARY_PATH=$(pwd)
./qnn-context-binary-generator --model libnet_g.so --backend libQnnHtp.so --binary_file libnet_g_cache # Creates in output/
This produces the following output:
0.0ms [ INFO ] Model:
0.0ms [ INFO ] Backend: libQnnHtp.so
0.4ms [ INFO ] Model wasn't loaded from a shared library.
0.6ms [ INFO ] qnn-sample-app build version: v2.8.0.230223123141_52150
0.6ms [ INFO ] Backend build version: v2.8.0.230223123141_52150
0.6ms [ INFO ] Initializing logging in the backend. Callback: [0x5c7f40b378], Log Level: [3]
0.0ms [ INFO ] Qnn log initialized
0.0ms [ INFO ] addClient done (1). status 0x0
0.0ms [ INFO ] addClient started.
0.0ms [ INFO ] addClient done (1). status = 0x0
0.8ms [ INFO ] Initialize Backend Returned Status = 0
0.0ms [ INFO ] QnnDevice_create started. device = 0x1
0.0ms [ INFO ] exit with 0
0.0ms [ INFO ] exit with 0
0.0ms [ INFO ] exit with 0
0.0ms [ INFO ] exit with 0
0.0ms [ INFO ] exits with 2, successfully initialized rpc memory
0.0ms [ INFO ] rpcMemoryAlloc 8 isInit 1
0.0ms [ INFO ] rpcMemoryAlloc 136 isInit 1
0.0ms [ INFO ] setSkelLogLevelInternal return 0
0.0ms [ INFO ] Polling not supported in createDevicePollingContext
0.0ms [ INFO ] QnnDevice_create done. status 0x0
0.0ms [ INFO ] QnnContext_createFromBinary started, context = 0xc78b09b8
0.0ms [ INFO ] Polling not supported in initializeGraphStats
0.0ms [ INFO ] rpcMemoryAlloc 4194488 isInit 1
0.0ms [ INFO ] rpcMemoryAlloc 12583072 isInit 1
0.0ms [ INFO ] rpcMemoryAlloc 2656 isInit 1
0.0ms [ INFO ] RPC Memory allocated for graph net_g with In 0x72cfce2000 [4000b8 B], Inouts 0x75f0d9b000 [a60 B], Outs 0x72cf0e1000 [c000a0 B]
0.0ms [ INFO ] rpcMemoryAlloc 28 isInit 1
0.0ms [ INFO ] DmaHandleMod 2 fd 4194488 va 0x20800000
0.0ms [ INFO ] dmahandleMod Latency 1290
0.0ms [ INFO ] DmaHandleMod 2 fd 2656 va 0xfdffa000
0.0ms [ INFO ] dmahandleMod Latency 1273
0.0ms [ INFO ] DmaHandleMod 2 fd 12583072 va 0x23000000
0.0ms [ INFO ] dmahandleMod Latency 1047
0.0ms [ INFO ] rpcMemoryAlloc 3896016 isInit 1
0.0ms [ INFO ] rpcMemoryAlloc 32 isInit 1
0.0ms [ INFO ] QnnContext_setConfig context 1
0.0ms [ INFO ] rpcMemoryAlloc 48 isInit 1
0.0ms [ INFO ] QnnContext_setConfig done. status 0x0
0.0ms [ INFO ] QnnContext_createFromBinary done. status 0x0
0.0ms [ INFO ] QnnGraph_retrieve started. context = 0x1
0.0ms [ INFO ] QnnGraph_retrieve context 1 graph net_g
0.0ms [ INFO ] QnnGraph_retrieve done. found graph net_g.
0.0ms [ INFO ] QnnGraph_execute started. graph = 0x1
0.0ms [ INFO ] QnnGraph_execute started. graph = 0x1
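A log like this can be scanned for API calls that logged "started" but never logged "done", which points at the call that did not return. The following is a minimal sketch, assuming log lines have the shape shown above (an API name such as `QnnGraph_execute` followed by `started` or `done`); `unfinished_calls` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list QNN API calls with a "started" line but no
# matching "done" line. Reads the named log files, or stdin if none given.
unfinished_calls() {
    awk '
        {
            for (i = 1; i < NF; i++) {
                # Match tokens such as QnnDevice_create or QnnGraph_execute.
                if ($i ~ /^Qnn[A-Za-z]+_[A-Za-z]+$/) {
                    if ($(i + 1) ~ /^started/) pending[$i] = 1
                    else if ($(i + 1) ~ /^done/) delete pending[$i]
                }
            }
        }
        END { for (name in pending) print name }
    ' "$@" | sort
}
```

Saving the output to a file (here a placeholder name, `htp_run.log`) and running `unfinished_calls htp_run.log` would list each call that never completed.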
I can see that it gets stuck when executing the graph.
qnn-context-binary-generator --model libnet_g.so --backend libQnnHtp.so --binary_file libnet_g_cache
It seems you ran this command on the device. I am not sure whether that is related; normally we run qnn-context-binary-generator on the host to get the cache file and then run the cache file on the device with qnn-net-run. A hang is hard to debug, as we may need to capture a full ramdump to check the HTP status. It is not simple.
By the way, QNN is outside the scope of QDN support, so I will not reply to further questions about QNN. You can raise an issue via CreatePoint.