Hi
I'm using a non-quantized DLC model for text recognition from JPEG images.
The DLC input is 1,1,32,320: a 1-channel JPEG, height 32, width 320, containing 6 characters.
When I execute it on the CPU runtime, it returns in less than 20 ms.
But when I execute it on the GPU runtime, it takes more than 1.2 s!
Has anyone encountered this problem? How can I fix it?
Thanks.
Dear customer,
Could you please share the commands you used to analyze the problem?
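For reference, a CPU vs. GPU comparison of this kind is commonly run with snpe-net-run. The snippet below is only a rough sketch of how the two command lines might be built; the file names model.dlc and input_list.txt and the helper build_net_run_cmd are hypothetical placeholders, not taken from this thread.

```python
import shlex


def build_net_run_cmd(container, input_list, runtime="cpu"):
    """Build an snpe-net-run command line for the given runtime.

    `container` and `input_list` are placeholder paths (assumptions).
    CPU is the default runtime, so no extra flag is added for it.
    """
    cmd = ["snpe-net-run", "--container", container, "--input_list", input_list]
    if runtime == "gpu":
        cmd.append("--use_gpu")
    elif runtime != "cpu":
        raise ValueError(f"unsupported runtime: {runtime}")
    return cmd


if __name__ == "__main__":
    # Print both command lines so they can be pasted into a shell or a reply.
    for rt in ("cpu", "gpu"):
        print(shlex.join(build_net_run_cmd("model.dlc", "input_list.txt", rt)))
```

Posting the exact command lines used for both runs makes it easier to tell whether the slowdown comes from the runtime itself or from how the run was set up.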
BR.
Yunxiang
Hi yunxqin,
Thanks for your reply.
I checked the "Limitations and Issues" section of the SNPE documentation, and found
And in my model, I really do use nn.Conv2d with groups. Here is my model segment: