Forums - Question about total inference time when running inference multiple times.

Question about total inference time when running inference multiple times.
gunsuk.seo
Join Date: 11 May 20
Posts: 8
Posted: Wed, 2020-05-27 16:31
Hi, I have a question about the total inference time when running inference multiple times.
 
Through snpe_bench.py, we can find the inference time of a network model.
I would like to get the inference time for multiple inputs instead of one, as in tutorial_inceptionv3 (https://developer.qualcomm.com/docs/snpe/tutorial_inceptionv3.html).
 
If we run snpe_bench.py on the 5 generated input raw files, the average time over the 5 inferences is reported. <- Is this right?
 
SNPE needs time to create and initialize the network the first time, and this takes a long time.
I want to know the total inference time once all 5 inputs have been inferenced.
But if snpe-net-run is executed 5 times, the create-and-initialize time is incurred every single time, so the total inference time will be as follows:
total_inference_time = (network_create_and_initialize_time + network_inference_time) * 5 (number of images)
 
However, I want to run snpe-net-run only once (one network create & initialize) and inference the given 5 images within that single run.
This means that after the network has been created once, we inference the input images 5 times:
total_inference_time = network_create_and_initialize_time + network_inference_time * 5 (number of images)
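
To make the difference concrete, here is a minimal sketch of the two totals in Python, using made-up example timings (the 200 ms and 30 ms figures are illustrative assumptions, not measurements):

    # Illustrative numbers only: assume create/initialize takes 200 ms
    # and a single forward pass takes 30 ms.
    create_ms, infer_ms, n = 200.0, 30.0, 5

    # snpe-net-run executed once per image: the network is rebuilt every time.
    per_image_total = (create_ms + infer_ms) * n   # (200 + 30) * 5 = 1150 ms

    # snpe-net-run executed once with all 5 inputs: one create, five inferences.
    single_run_total = create_ms + infer_ms * n    # 200 + 30 * 5 = 350 ms

    print(per_image_total, single_run_total)       # 1150.0 350.0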
 
How should I work with snpe_bench.py to see this calculated time?
If that's not possible with the SNPE toolkit, should we just multiply the single-image inference time by 5 to calculate the inference time for 5 input images?

 

gesqdn-forum
Join Date: 4 Nov 18
Posts: 184
Posted: Tue, 2020-06-02 03:49


Hi Gunsuk,

Here are the details of how snpe_bench.py works.

When you run snpe_bench.py to test the performance of a model on any Qualcomm hardware using the Qualcomm NPE:

  1.  The model and the images used for inference are pushed to the externally connected hardware.
  2.  snpe-net-run is executed with the given configuration (performance profile, debug level, etc.).
  3.  snpe-net-run loads the files required for the complete process, initializes the setup, and creates the network.
  4.  Once this is done, the images are passed to the model for inference one by one (a sketch of this one-create, many-inference invocation follows the list).
  5.  If the Runs variable is configured to more than one, steps 3 and 4 are repeated for the count Runs is configured with.
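
Note that snpe-net-run itself gives you exactly the pattern you asked for: it takes an input list file with one line per preprocessed .raw input and runs all of them after a single network creation. As a minimal sketch of such an invocation from Python (the model and file names are assumptions borrowed from the InceptionV3 tutorial, not fixed names):

    import subprocess

    # Write an input list: one preprocessed .raw tensor path per line.
    # The file names here are assumptions based on the InceptionV3 tutorial.
    raw_files = ["cropped/img_%d.raw" % i for i in range(5)]
    with open("input_list.txt", "w") as f:
        f.write("\n".join(raw_files) + "\n")

    # One snpe-net-run execution: the network is created and initialized once,
    # then all 5 inputs are inferenced with the already-built network.
    subprocess.run(
        ["snpe-net-run",
         "--container", "inception_v3.dlc",  # model converted to DLC
         "--input_list", "input_list.txt",
         "--output_dir", "output"],
        check=True,
    )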


From the above explanation, I want to summarise what happens if 5 images are provided to the model for inference using snpe_bench.py with 2 runs.

The numbers provided in the summary result sheet are the average of each measurement.
This means the create time is measured for each of the 2 runs and averaged, so the creation time mentioned in the summary sheet is for a single network creation.
Similarly, the inference time is the inference time for a single image, measured across all 5 images over the 2 runs and averaged.

The formula to use for your requirement is:
total_inference_time = network_create_and_initialize_time + network_inference_time * 5 (number of images)
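
As a minimal sketch, assuming you take the averaged create time and the averaged per-image inference time from the snpe_bench.py summary sheet (the variable names here are illustrative, not the sheet's actual column labels):

    def total_inference_time(create_ms, per_image_infer_ms, num_images=5):
        # The summary sheet reports averages: one network creation and one
        # per-image inference. The network is created only once, so its time
        # is added once; each image then adds one inference.
        return create_ms + per_image_infer_ms * num_images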

