For models whose layers are completely supported on both the NPU and the DSP individually, the inference times under the AIP runtime and the DSP runtime do not seem to differ much; sometimes the DSP runtime even gives better performance than the AIP runtime.
Is the AIP runtime not expected to give significantly better performance than the DSP runtime when all the layers are supported on the NPU/HTA?
If not, what value add is expected from the AIP runtime?
Hi Prashant,
The statement "The higher processing power of the DSP will help my model perform better compared to the GPU" is not always true.
We worked with a Facial Expression Recognition (FER) model built using Keras and converted to a DLC file.
Comparing the total inference time on GPU and DSP leads to the conclusion that DSP performance is about 60% of the GPU's. Before drawing that conclusion, we also accounted for the time consumed by RPC Execute (which acts as a communicator between the CPU/GPU and the DSP), SNPE Accelerator, and Accelerator. Taking these parameters into account, the GPU performs better than the DSP for a single prediction or a small number of predictions. The DSP is the right choice of runtime only when a large number of predictions must be made with the FER model.
Request you to check the benchmarking application from SNPE with multiple iterations and compare the results.
Here you can find the instructions on the usage of the Benchmarking Tool.
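To illustrate, the Benchmarking Tool (snpe_bench.py) is driven by a JSON configuration that lets you select multiple runtimes and a number of runs, so DSP and AIP timings can be compared on the same model. The sketch below is illustrative only; the model name, file names, paths, and device serial are assumptions you would replace with your own:

```json
{
    "Name": "FER_benchmark",
    "HostRootPath": "fer_benchmark",
    "HostResultsDir": "fer_benchmark/results",
    "DevicePath": "/data/local/tmp/snpebm",
    "Devices": ["<device-serial>"],
    "Runs": 50,
    "Model": {
        "Name": "FER",
        "Dlc": "fer_model.dlc",
        "InputList": "input_list.txt",
        "Data": ["data/cropped"]
    },
    "Runtimes": ["GPU", "DSP", "AIP"],
    "Measurements": ["timing"]
}
```

With a config like this, running `python snpe_bench.py -c fer_benchmark.json` executes the model for the configured number of runs per runtime, which averages out per-call overheads such as RPC Execute and gives a fairer comparison than a single prediction.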
Hi,
Thanks for your inputs.
I do understand that the DSP performance is better than GPU.
My question is whether the AIP runtime, which uses the NPU/HTA (Hexagon Tensor Accelerator), performs better than the DSP runtime or not.
The total inference times observed for the AIP runtime and the DSP runtime are very close, and sometimes the DSP performs better than the AIP. This is the case even when the complete network is supported by the HTA/NPU while running on AIP, with no offloading to HVX or the CPU.
So, what is the advantage of using HTA/NPU over DSP?
Request you to have a look at this: https://developer.qualcomm.com/docs/snpe/aip_runtime.html