Hello,
I am trying to run a model on the AIP with a batch size > 1 on the RB6/EB6. I have a base model with a batch size of 1. I quantized it with --enable_hta and --enable_htp, and I can run it on the DSP and the AIP with a batch size of 1, both with snpe-net-run and with my own C++ code. Following "Snapdragon Neural Processing Engine SDK: Network Resizing" (qualcomm.com), I can resize it to a larger batch size on CPU, GPU, and DSP, but when I try to resize the batch dimension on the AIP I get: "node input_tensor_1:0: B or C dimension resize is not supported. npu_resize_graph: graph resize failure"
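For reference, the resize path in my C++ code looks roughly like this (a minimal sketch against the SNPE 1.x C++ API; the input name, the NHWC shape, and the DLC path are placeholders standing in for my actual values):

```cpp
// Sketch only: rebuild the SNPE object with a new batch dimension,
// per the SDK's Network Resizing page.
#include "DlContainer/IDlContainer.hpp"
#include "SNPE/SNPE.hpp"
#include "SNPE/SNPEBuilder.hpp"
#include "DlSystem/TensorShape.hpp"
#include "DlSystem/TensorShapeMap.hpp"

#include <memory>

std::unique_ptr<zdl::SNPE::SNPE> buildResized(const char* dlcPath, size_t batch)
{
    auto container =
        zdl::DlContainer::IDlContainer::open(zdl::DlSystem::String(dlcPath));

    // Request a larger batch (B) dimension for the NHWC input tensor.
    zdl::DlSystem::TensorShapeMap shapeMap;
    shapeMap.add("input_tensor_1:0",
                 zdl::DlSystem::TensorShape({batch, 224, 224, 3}));

    return zdl::SNPE::SNPEBuilder(container.get())
        .setRuntimeProcessor(zdl::DlSystem::Runtime_t::AIP_FIXED8_TF)
        .setInputDimensions(shapeMap)
        .build();  // on AIP this is where the npu_resize_graph failure surfaces
}
```

The same code works on CPU, GPU, and DSP when I swap the runtime; only the AIP rejects the B-dimension resize.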
According to the "general AIP limitations" section of "Snapdragon Neural Processing Engine SDK: Limitations" (qualcomm.com), I should manually partition the model when quantizing so that the input layer runs on HVX instead of HTA, as described in "Snapdragon Neural Processing Engine SDK: Adding HTA sections" (qualcomm.com). When I quantize the model with automatic partitioning, it reports that layers 0-128 are partitioned to run on the HTA and layer 129 (ArgMax) to run on the HVX. So when quantizing with manual partitioning I passed --hta-partitions 1-128, keeping the same layers on the HTA except the first layer (layer 0), which is moved to the HVX. The quantization completes with no errors and reports that it generated 3 subnets: 0-0 on HVX, 1-128 on HTA, and 129-129 on HVX.
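For reference, the two quantize invocations look roughly like this (DLC and input-list file names are placeholders; the --enable_hta, --enable_htp, and --hta-partitions flags are exactly the ones I passed):

```shell
# Automatic partitioning: layers 0-128 -> HTA, layer 129 (ArgMax) -> HVX
snpe-dlc-quantize --input_dlc model.dlc --input_list input_list.txt \
                  --enable_hta --enable_htp \
                  --output_dlc model_quant_auto.dlc

# Manual partitioning: keep layers 1-128 on HTA so layer 0 falls to HVX
snpe-dlc-quantize --input_dlc model.dlc --input_list input_list.txt \
                  --enable_hta --hta-partitions 1-128 --enable_htp \
                  --output_dlc model_quant_manual.dlc
```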
However, when I try to run this manually partitioned model, it fails to build the SNPE object, and much of the time it actually crashes the RB6/EB6. This happens both when running with batch size = 1 and when trying to resize the model to a larger batch size.
I also generated a model with a fixed batch size of 8 and produced the same quantized models with both manual and automatic partitioning. These are also unable to run on the AIP with snpe-net-run, failing with error_code=910; error_message=DSP system runtime error. I have also tried all of the above with the --enable_htp flag removed from the quantization step.
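The fixed batch size of 8 was baked in at conversion time. Assuming a TensorFlow source (the "input_tensor_1:0" name suggests one), the conversion looks roughly like this; the graph file, input name, dimensions, and output node are placeholders for my actual values:

```shell
# Fix the batch dimension to 8 at conversion, then quantize as before
snpe-tensorflow-to-dlc --input_network frozen_model.pb \
                       --input_dim input_tensor_1 "8,224,224,3" \
                       --out_node argmax_output \
                       --output_path model_b8.dlc
```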
Is there any guidance on how to solve this problem? Has anyone successfully managed to get a model with batch size > 1 to run on the AIP?
Thanks.