When we run benchmarking on the GPU, the report contains "Total Inference Time" and "Forward Propagate".
What is the difference between these two terms?
Thanks!
Total Inference Time = the total time from when snpe->execute() is called until it returns.
Forward Propagate Time = the time the individual runtime (CPU, GPU, or DSP) takes to run the inference.
So "Total Inference Time" is the time that callers of the API will observe. "Forward Propagate Time" is encompassed within the total, but only covers runtime-specific code. The two can differ depending on the type of buffers used (Float, FloatUserBuffer, TF8UserBuffer) and on the small amount of code that runs in SNPE before the actual runtime (CPU, GPU, DSP) code is invoked.