Hello, I have a question about inference time.
To run a deep learning model (ONNX) on the desired runtime in SNPE, I understand that two steps (three with quantization) are required:
1. onnx -> dlc (snpe-onnx-to-dlc); 2. snpe-dlc-quantize, if you are using a quantized model; 3. snpe_bench.py.
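In case it helps, this is roughly how I run the three steps (the model file names and the input list are placeholders from my setup; flags as listed on the SNPE tools page):

```shell
# 1. Convert ONNX to DLC -- note there is no --validation_target here
snpe-onnx-to-dlc --input_network model.onnx --output_path model.dlc

# 2. (Only needed for the DSP) quantize the DLC with a list of raw input samples
snpe-dlc-quantize --input_dlc model.dlc --input_list input_list.txt \
                  --output_dlc model_quantized.dlc

# 3. Benchmark; the runtime (CPU/GPU/DSP) is chosen in the JSON config
python snpe_bench.py -c alexnet_sample.json
```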
If you look at https://developer.qualcomm.com/docs/snpe/tools.html, there is an argument for creating the dlc:
(--validation_target: runtime target against which model will be validated. Choices for RUNTIME_TARGET: {cpu, gpu, dsp}.)
What I want to know is: if I want to set the runtime to GPU or DSP and observe the inference time, don't I have to pass the --validation_target (runtime target) argument to snpe-onnx-to-dlc?
Here are the two reasons I thought so:
First, there is a separate step (snpe-dlc-quantize) to quantize the model after snpe-onnx-to-dlc in order to run on the DSP (when I want the runtime to be the DSP).
Second, even when I want to run on the DSP, --validation_target does not seem to be specified for snpe-onnx-to-dlc in the examples.
(In the tutorial, we do choose a runtime when making the dlc. (Tutorials Setup, Getting Inception v3), https://developer.qualcomm.com/docs/snpe/tutorial_setup.html#tutorial_se... )
"2. Run the script to download model and set up to run on DSP:
python $SNPE_ROOT/models/inception_v3/scripts/setup_inceptionv3.py -a ~/tmpdir -d -r dsp"
Even when I follow this tutorial, snpe-onnx-to-dlc is not given --validation_target.
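For comparison, if validation were required I would have expected the conversion step to look something like this (a hypothetical invocation; the --validation_target flag is quoted from the tools page, and depending on the SDK version it may also expect a processor target alongside the runtime):

```shell
# Hypothetical: passing a validation target at conversion time
snpe-onnx-to-dlc --input_network inception_v3.onnx \
                 --output_path inception_v3.dlc \
                 --validation_target dsp
```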
Also, regarding the Benchmarking Overview (https://developer.qualcomm.com/docs/snpe/benchmarking.html):
As I explained earlier, I run snpe-onnx-to-dlc without --validation_target (the default appears to be cpu), then run snpe-dlc-quantize,
and then execute snpe_bench.py with DSP as the runtime (Runtimes: Possible values are "GPU", "GPU_FP16", "DSP" and "CPU". You can use any combination of these.) in alexnet_sample.json (profiling level "detailed"). The result is the same as on the homepage.
Generated csv:
avg(us) max(us) min(us) runtime
Total Inference Time 31874 31874 31874 CPU|DSP
...
layer_000 (Name:data Type:data) 106 106 106 DSP
layer_001 (Name:vgg0_conv0_fwd Type:convolutional) 0 0 0 DSP
layer_002 (Name:vgg0_relu0_fwd Type:neuron) 973 973 973 DSP
...
The actual tutorial example (Running AlexNet that is Shipped with the SDK) shows the same result as above (the runtime column of the table):
Total Inference Time 12752 12762 12742 CPU|DSP
...
layer_000 (Name:input 0 Type:data) 112 116 108 DSP
...
I understand that this example was also measured with alexnet_sample.json using DSP as the runtime.
Similarly, if I do not give --validation_target to snpe-onnx-to-dlc and instead set the Runtime in snpe_bench.py (alexnet_sample.json) to GPU,
the runtime column in the CSV shows "CPU|GPU" for "Total Inference Time" and "GPU" for the network layers.
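For reference, the relevant part of my benchmark config looks roughly like this (field names as I understand them from the benchmarking docs; the paths and device id are placeholders from my setup):

```json
{
    "Name": "AlexNet",
    "HostRootPath": "alexnet",
    "HostResultsDir": "alexnet/results",
    "DevicePath": "/data/local/tmp/snpebm",
    "Devices": ["<device-id>"],
    "Runs": 1,
    "Model": {
        "Name": "AlexNet",
        "Dlc": "dlc/bvlc_alexnet_quantized.dlc",
        "InputList": "data/target_raw_list.txt",
        "Data": ["data/cropped"]
    },
    "Runtimes": ["DSP"],
    "Measurements": ["timing"],
    "ProfilingLevel": "detailed"
}
```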
Finally, to summarize my question:
When running on the GPU or DSP, is it fine to give the runtime only in snpe_bench.py (alexnet_sample.json), even if --validation_target is not given to snpe-onnx-to-dlc?
This is very confusing.
Any help would be greatly appreciated.