Hello.
I was able to benchmark my model on GPU and saw that there are two result rows: GPU_timing and GPU_ub_float_timing.
GPU_timing inference time: 67280 usec
GPU_ub_float_timing inference time: 1249774 usec
Actual model execution in the sample app takes about 240-320 ms, measured as the wall-clock time taken before and after the execute call.
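For reference, this is roughly how I take that 240-320 ms measurement (a minimal sketch; `runInference` is a hypothetical stand-in for the app's actual execute call):

```cpp
#include <chrono>
#include <iostream>

// Hypothetical stand-in for the real inference call in the sample app.
static void runInference() {
    // ... actual model execution goes here ...
}

int main() {
    // Record wall-clock time immediately before and after execution.
    const auto start = std::chrono::steady_clock::now();
    runInference();
    const auto end = std::chrono::steady_clock::now();

    const auto usec =
        std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
    std::cout << "Inference time: " << usec << " usec" << std::endl;
    return 0;
}
```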
Are there any tricks to enable the mode measured by GPU_timing, as opposed to GPU_ub_float_timing?
And if the model is already executing in the GPU_timing mode, why do my measurements differ from the benchmark by about 4x (67 ms vs. ~250 ms)?
I'd appreciate an answer. Thanks.