Are there any details about how the official AI performance figures for the RB5 were measured? They differ a lot from what I get on my own RB5, especially for the DSP (int8). For example, for ResNet50 my test result is 68.8 images/s, but the official result is 125.1 inf/s. That is almost a 2x speed gap (about 1.8x). My results were measured with the snpe-benchmark tool, with an input resolution of 224x224x3 and batch size 1, and I set it to run in burst mode. I also tested with batch size 4, and it seems that for ResNet50 the RB5 does not really benefit from the larger batch size: the speed at batch size 4 is even slower than at batch size 1.
My results (images/s, batch size 1):
Models | DSP | G16
ResNet50 | 68.8 | 30.5
Official AI performance from Qualcomm (inf/s):
Models | DSP | G16
ResNet50 | 125.1 | 37.4
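To put a number on the gap, here is a quick sketch that divides the official figures by my measured ones. The values are copied from the two tables above; the variable and function names are just mine for illustration, and I am assuming images/s and inf/s are comparable at batch size 1.

```python
# Throughput figures copied from the two tables in this post.
# Assumption: at batch size 1, images/s and inf/s mean the same thing.
measured = {"DSP": 68.8, "G16": 30.5}    # my RB5, images/s
official = {"DSP": 125.1, "G16": 37.4}   # Qualcomm figures, inf/s


def speed_gap(official_rate, measured_rate):
    """Return how many times faster the official figure is."""
    return official_rate / measured_rate


gaps = {name: speed_gap(official[name], measured[name]) for name in measured}

for name, gap in gaps.items():
    print(f"ResNet50 {name}: official is {gap:.2f}x my result")
```

So the gap is much larger on the DSP column than on the G16 column, which is why I am mainly asking about the DSP (int8) setup.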