We are developing an inference runtime on top of Android's NNAPI. By installing the ailia AI showcase linked below, you can run yolox inference through NNAPI.
On the Snapdragon 8+ it works fine: yolox_tiny runs in about 6 ms. On the Snapdragon 888, however, yolox_tiny inference takes nearly 4000 ms.
If we wrap the Concat with Transpose ops, i.e. rewrite the graph as Transpose -> Concat -> Transpose, the time drops to 794 ms.
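For reference, the rewrite above is numerically a no-op: concatenating NCHW tensors along the channel axis is equivalent to transposing to NHWC, concatenating along the (now last) channel axis, and transposing back. A minimal NumPy sketch of the equivalence (the shapes and axis choice here are illustrative assumptions, not taken from the actual yolox graph):

```python
import numpy as np

# Two NCHW feature maps to be concatenated along the channel axis (axis 1).
a = np.random.rand(1, 32, 8, 8).astype(np.float32)
b = np.random.rand(1, 64, 8, 8).astype(np.float32)

# Direct Concat on axis 1 (the pattern observed to be slow on Snapdragon 888).
direct = np.concatenate([a, b], axis=1)

# Workaround pattern: Transpose -> Concat -> Transpose.
# NCHW -> NHWC, concat on the last axis, then transpose back to NCHW.
a_t = a.transpose(0, 2, 3, 1)
b_t = b.transpose(0, 2, 3, 1)
wrapped = np.concatenate([a_t, b_t], axis=3).transpose(0, 3, 1, 2)

# Both paths produce identical results; only the layout seen by the
# driver's Concat kernel differs.
assert np.array_equal(direct, wrapped)
print(direct.shape)  # (1, 96, 8, 8)
```

In the actual NNAPI graph the same rewrite would be expressed with ANEURALNETWORKS_TRANSPOSE operations around ANEURALNETWORKS_CONCATENATION; the snippet only shows why the two graphs compute the same tensor.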
Concat seems to be very slow on the Snapdragon 888, and the Snapdragon 855 shows the same trend.
Is there a way to speed up Concat?
Dear developer,
We do not fully understand the issue you mentioned.
Both Concat and Transpose are supported on SNPE.
Thank you for your reply. We are using NNAPI directly, not SNPE.