Forums - Benchmarking Inception-v1: GPU has higher performance than DSP (SD835)

atul.rahman
Join Date: 14 Nov 16
Posts: 3
Posted: Tue, 2017-08-08 21:58
Hello,
 
I ran all three runtimes, CPU (non-quantized, 32-bit float), GPU (non-quantized, 32-bit float), and DSP (quantized, 8-bit), with the Inception-v1 and Inception-v3 graphs on a Galaxy S8. For the Inception-v3 graph, the results meet expectations: the DSP has 1.5x higher performance and 2.72x higher inferences/s/W than the GPU.
 
However, for Inception-v1, the GPU has 1.7x higher performance and 1.46x higher inferences/s/W than the DSP. Why is that the case? Is it because the GPU has more internal memory and can keep more of the data on-chip for the relatively smaller Inception-v1 graph?
 
The results for the Galaxy S8 (performance mode) are below. The measured power is the load power.
 
Inception-v1:
Unit (inf/sec):   CPU --> 6.22, DSP --> 21.94, GPU --> 37.30
Unit (inf/sec/W): CPU --> 9.80, DSP --> 38.48, GPU --> 56.28
 
Inception-v3:
Unit (inf/sec):   CPU --> 1.33, DSP --> 13.51, GPU --> 9.04
Unit (inf/sec/W): CPU --> 1.11, DSP --> 24.24, GPU --> 8.92
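
The ratios quoted in this post can be checked directly from the figures above. The following is a small sketch, not part of the original measurements: the function names are made up for illustration, and the "implied power" is simply throughput divided by efficiency, which assumes both figures were taken during the same run.

```python
# Throughput (inf/sec) and efficiency (inf/sec/W) as reported in the post.
results = {
    "Inception-v1": {"CPU": (6.22, 9.80), "DSP": (21.94, 38.48), "GPU": (37.30, 56.28)},
    "Inception-v3": {"CPU": (1.33, 1.11), "DSP": (13.51, 24.24), "GPU": (9.04, 8.92)},
}

def gpu_vs_dsp(model):
    """Return (throughput ratio, efficiency ratio) of GPU relative to DSP."""
    gpu_tp, gpu_eff = results[model]["GPU"]
    dsp_tp, dsp_eff = results[model]["DSP"]
    return gpu_tp / dsp_tp, gpu_eff / dsp_eff

def implied_power_watts(model, runtime):
    """Load power implied by the two reported figures: (inf/sec) / (inf/sec/W).

    Assumption: both figures come from the same measurement run.
    """
    throughput, efficiency = results[model][runtime]
    return throughput / efficiency

for model in results:
    tp_ratio, eff_ratio = gpu_vs_dsp(model)
    print(f"{model}: GPU/DSP throughput {tp_ratio:.2f}x, efficiency {eff_ratio:.2f}x")
    for runtime in results[model]:
        print(f"  {runtime}: ~{implied_power_watts(model, runtime):.2f} W implied")
```

A ratio above 1 means the GPU is ahead of the DSP; a ratio below 1 means the DSP is ahead (take the reciprocal to get the DSP's advantage).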
moljaca moderator
Join Date: 25 Jul 17
Location: San Diego
Posts: 40
Posted: Tue, 2017-08-29 08:48

Hi,

Thanks for your interest in Snapdragon NPE.

What you observed is accurate: some layer types / configurations present in the v1 network are not yet fully optimized for the DSP runtime. The runtime optimizations are a work in progress and will improve in future SNPE SDK releases.

Best regards.

yangfan34
Join Date: 6 Dec 17
Posts: 9
Posted: Tue, 2018-05-08 23:27

Hi, 

I think the following parameters are very useful when evaluating inference performance:

Inception-v1:
Unit (inf/sec):   CPU --> 6.22, DSP --> 21.94, GPU --> 37.30
Unit (inf/sec/W): CPU --> 9.80, DSP --> 38.48, GPU --> 56.28

But I don't know how to measure inference performance in inferences/sec/W on a Snapdragon device (e.g. SDM821). Is there a benchmark tool for obtaining these figures? I look forward to your suggestions. Thanks so much.

 

Yangfan

gesqdn-forum
Join Date: 4 Nov 18
Posts: 184
Posted: Tue, 2019-11-05 03:10

Hi,
The statement "The higher processing power of the DSP will help my model perform better than on the GPU" is not always true.
We worked with a Face Expression Recognition (FER) model built with Keras and converted to a DLC file.

Comparing total inference time on the GPU and the DSP, we conclude that DSP performance is 60% of that of the GPU. Before drawing that conclusion, we also accounted for the time spent in RPC Execute (which acts as a communicator between the CPU/GPU and the DSP), SNPE Accelerator, and Accelerator. Taking these into account, the GPU appears to perform better than the DSP for a single prediction or a small number of predictions. We would choose the DSP as the runtime only if we needed to make a larger number of predictions with the FER model.
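
The trade-off described here (a runtime with a higher fixed cost only pays off over many predictions) can be sketched as a break-even calculation. This is an illustrative model only: it assumes total time is a one-time overhead plus a constant per-inference cost, and the numbers in the usage lines are hypothetical, not measurements from the FER model.

```python
import math

def total_time_ms(setup_ms, per_inference_ms, n):
    """Total wall time for n inferences: one-time overhead plus steady-state cost."""
    return setup_ms + per_inference_ms * n

def break_even_count(setup_a, per_a, setup_b, per_b):
    """Smallest n at which runtime A is strictly faster than runtime B.

    A is the runtime with the higher one-time overhead and the lower
    per-inference cost. Returns None if A never catches up.
    """
    if per_a >= per_b:
        return None
    if setup_a <= setup_b:
        return 1
    return math.floor((setup_a - setup_b) / (per_b - per_a)) + 1

# Hypothetical figures in milliseconds (NOT measured values):
# runtime A: 50 ms overhead, 5 ms per inference
# runtime B: 10 ms overhead, 10 ms per inference
n = break_even_count(50, 5, 10, 10)
print(n, total_time_ms(50, 5, n), total_time_ms(10, 10, n))
```

With real profiling data, the overhead and per-inference terms would be fitted from runs at several batch counts rather than assumed.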
 

zhengxin.zhao
Join Date: 13 Mar 19
Posts: 12
Posted: Tue, 2019-11-05 18:37

I have the same question. How can I measure inference performance in inferences/sec/W on a Snapdragon device (e.g. 855 or sa8155p)?

I know Snapdragon Profiler can give some profiling data, but it does not include power in watts.

Looking forward to your reply.
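
For reference, the metric itself is just throughput divided by average power, which is the same as inferences per joule. The sketch below shows only that arithmetic. How the power is obtained (for example from an external power monitor) is an assumption left to the reader, the function name is hypothetical, and subtracting an idle baseline is an optional choice that the original post does not confirm.

```python
def inferences_per_sec_per_watt(num_inferences, duration_s, avg_power_w, idle_power_w=0.0):
    """Compute inf/sec/W from a timed run and an average power reading.

    num_inferences: inferences completed during the run
    duration_s:     wall-clock duration of the run in seconds
    avg_power_w:    average power drawn during the run in watts
    idle_power_w:   optional idle baseline to subtract (assumption, default 0)
    """
    load_power_w = avg_power_w - idle_power_w
    if duration_s <= 0 or load_power_w <= 0:
        raise ValueError("duration and load power must be positive")
    throughput = num_inferences / duration_s  # inf/sec
    return throughput / load_power_w          # inf/sec/W

# Hypothetical run: 100 inferences in 5 s at an average of 0.5 W.
print(inferences_per_sec_per_watt(100, 5.0, 0.5))
```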

