Hi,
I tested a PyTorch UNet model containing only 2D convolution, down/up-sampling, and ReLU operators, converted it to DLC with snpe-pytorch-to-dlc, and quantized it with snpe-dlc-quantize. When I deployed the model on the Snapdragon 888 DSP with snpe-net-run, the first trial with the original parameters showed a fast initialization time (Accelerator Init Time). But when I reduced the network width by half (halving the channel count of each layer), the initialization time increased 5x. I cannot figure out why, since apart from the model width, all other operators remain the same.
Any help explaining this behavior would be much appreciated. Thanks.
Dear customer,
What conversion commands did you use?
We recommend quantizing the model with the --enable_htp option if you execute your model on the 888's DSP. The command looks like the one below.
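For example (a sketch assuming the standard SNPE SDK tools; the DLC file names and input_list.txt are placeholders, and --htp_socs is available in recent SNPE releases, where sm8350 targets Snapdragon 888):

    snpe-dlc-quantize --input_dlc unet_fp32.dlc \
                      --input_list input_list.txt \
                      --output_dlc unet_quantized.dlc \
                      --enable_htp \
                      --htp_socs sm8350

The --enable_htp option embeds an offline-prepared HTP graph cache in the quantized DLC, so graph preparation happens on the host at quantize time rather than on the device at initialization, which typically reduces the Accelerator Init Time you are measuring.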
BR.
Wei