Hello, I have a question about inference time.
To run a deep learning model (ONNX) on the desired runtime in SNPE, I understand that two steps (three with quantization) are required:
1. onnx -> dlc (snpe-onnx-to-dlc); 2. snpe-dlc-quantize, if you are using a quantized model; 3. snpe_bench.py.
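In case it helps, this is roughly how I run the three steps (the model file names and the input list are placeholders from my setup; flags as listed on the SNPE tools page):

```shell
# 1. Convert ONNX to DLC -- note there is no --validation_target here
snpe-onnx-to-dlc --input_network model.onnx --output_path model.dlc

# 2. (Only needed for the DSP) quantize the DLC with a list of raw input samples
snpe-dlc-quantize --input_dlc model.dlc --input_list input_list.txt \
                  --output_dlc model_quantized.dlc

# 3. Benchmark; the runtime (CPU/GPU/DSP) is chosen in the JSON config
python snpe_bench.py -c alexnet_sample.json
```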
If you look at https://developer.qualcomm.com/docs/snpe/tools.html, there is an argument for creating the dlc:
(--validation_target: runtime target against which model will be validated. Choices for RUNTIME_TARGET: {cpu, gpu, dsp}.)
What I want to know is: if I want to set the runtime to GPU or DSP and observe the inference time, don't I have to pass the --validation_target (runtime target) argument to snpe-onnx-to-dlc?
Here are the two reasons I thought so:
First, there is a separate step (snpe-dlc-quantize) to quantize the model after snpe-onnx-to-dlc in order to run on the DSP (when I want the runtime to be the DSP).
Second, even when I want to run on the DSP, --validation_target does not seem to be specified for snpe-onnx-to-dlc in the examples.
(In the tutorial, we do choose a runtime when making the dlc. (Tutorials Setup, Getting Inception v3), https://developer.qualcomm.com/docs/snpe/tutorial_setup.html#tutorial_se... )
"2. Run the script to download model and set up to run on DSP:
python $SNPE_ROOT/models/inception_v3/scripts/setup_inceptionv3.py -a ~/tmpdir -d -r dsp"
Even when I follow this tutorial, snpe-onnx-to-dlc is not given --validation_target.
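For comparison, if validation were required I would have expected the conversion step to look something like this (a hypothetical invocation; the --validation_target flag is quoted from the tools page, and depending on the SDK version it may also expect a processor target alongside the runtime):

```shell
# Hypothetical: passing a validation target at conversion time
snpe-onnx-to-dlc --input_network inception_v3.onnx \
                 --output_path inception_v3.dlc \
                 --validation_target dsp
```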
Also, regarding the Benchmarking Overview (https://developer.qualcomm.com/docs/snpe/benchmarking.html):
As I explained earlier, I run snpe-onnx-to-dlc without --validation_target (the default appears to be cpu), then run snpe-dlc-quantize,
and then execute snpe_bench.py with DSP as the runtime (Runtimes: Possible values are "GPU", "GPU_FP16", "DSP" and "CPU". You can use any combination of these.) in alexnet_sample.json (profiling level "detailed"). The result is the same as on the homepage.
Generated csv:
avg(us) max(us) min(us) runtime
Total Inference Time 31874 31874 31874 CPU|DSP
...
layer_000 (Name:data Type:data) 106 106 106 DSP
layer_001 (Name:vgg0_conv0_fwd Type:convolutional) 0 0 0 DSP
layer_002 (Name:vgg0_relu0_fwd Type:neuron) 973 973 973 DSP
...
The actual tutorial example (Running AlexNet that is Shipped with the SDK) shows the same result as above (the runtime column of the table):
Total Inference Time 12752 12762 12742 CPU|DSP
...
layer_000 (Name:input 0 Type:data) 112 116 108 DSP
...
I understand that this example was also measured with alexnet_sample.json using DSP as the runtime.
Similarly, if I do not give --validation_target to snpe-onnx-to-dlc and instead set the Runtime in snpe_bench.py (alexnet_sample.json) to GPU,
the runtime column in the CSV shows "CPU|GPU" for "Total Inference Time" and "GPU" for the network layers.
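For reference, the relevant part of my benchmark config looks roughly like this (field names as I understand them from the benchmarking docs; the paths and device id are placeholders from my setup):

```json
{
    "Name": "AlexNet",
    "HostRootPath": "alexnet",
    "HostResultsDir": "alexnet/results",
    "DevicePath": "/data/local/tmp/snpebm",
    "Devices": ["<device-id>"],
    "Runs": 1,
    "Model": {
        "Name": "AlexNet",
        "Dlc": "dlc/bvlc_alexnet_quantized.dlc",
        "InputList": "data/target_raw_list.txt",
        "Data": ["data/cropped"]
    },
    "Runtimes": ["DSP"],
    "Measurements": ["timing"],
    "ProfilingLevel": "detailed"
}
```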
Finally, to summarize my question:
When running on the GPU or DSP, is it fine to give the runtime only in snpe_bench.py (alexnet_sample.json), even if --validation_target is not given to snpe-onnx-to-dlc?
This is very confusing.
Any help would be greatly appreciated.