I am converting a PyTorch model to DLC format in order to run it on the GPU of the QRB5165. I noticed that under certain circumstances, mean reduce operations (torch.mean) return incorrect output when run on the GPU. When the model runs on the CPU, the output is as expected.
Here is a minimal working example to reproduce the issue:
import torch

class MyNet(torch.nn.Module):
    def forward(self, x):
        x = torch.mean(x, dim=3)
        return x

model = MyNet()
model.eval()
example = torch.rand(1, 16, 8, 4)
script_module = torch.jit.trace(model, example)
script_module.save("mynet.pt")
I convert the PyTorch model to DLC format via:
snpe-pytorch-to-dlc -i mynet.pt --input_dim input 1,16,8,4 --input_encoding x other other
I then run the model via snpe-net-run on some random input, once on the CPU and once on the GPU (gpu_mode=default) and compare the raw outputs. In both cases, the output consists of 16*8=128 values, as expected. In the GPU output, however, only the first 16 values are correct, all others are zero. In the CPU output, all 128 values are non-zero, as expected.
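The comparison itself is straightforward: snpe-net-run writes raw float32 files, which can be loaded with NumPy and diffed. A minimal sketch (the file paths are placeholders; the actual Result_N directory and output file names depend on the snpe-net-run invocation):

```python
import numpy as np

def load_raw(path):
    """Load a raw float32 output file as written by snpe-net-run."""
    return np.fromfile(path, dtype=np.float32)

def count_mismatches(cpu, gpu, atol=1e-5):
    """Return the number of elements that differ between two output buffers."""
    return int(np.sum(~np.isclose(cpu, gpu, atol=atol)))

# Placeholder paths -- substitute the actual snpe-net-run output locations:
# cpu = load_raw("output_cpu/Result_0/out.raw")
# gpu = load_raw("output_gpu/Result_0/out.raw")
# print(cpu.size, gpu.size)            # 128 values each for a (1, 16, 8) output
# print(count_mismatches(cpu, gpu))
```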
I did some further investigations and noticed the following:
- The problem only occurs when dim=3 is passed to torch.mean. It does not occur when computing the mean over any other dimension.
- The problem does not occur when keepdim=True is set in the torch.mean call. So as a workaround, one can set keepdim=True and apply torch.squeeze on the result afterwards.
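The workaround from the second bullet can be sketched as follows; the model is functionally identical to the original MyNet (the traced graph just reduces with keepdim=True and squeezes afterwards):

```python
import torch

class MyNetWorkaround(torch.nn.Module):
    def forward(self, x):
        # keepdim=True avoids the faulty reduce path on the GPU;
        # the reduced dimension is then dropped explicitly.
        x = torch.mean(x, dim=3, keepdim=True)  # shape (1, 16, 8, 1)
        x = torch.squeeze(x, dim=3)             # shape (1, 16, 8)
        return x

model = MyNetWorkaround()
model.eval()
example = torch.rand(1, 16, 8, 4)
out = model(example)
print(out.shape)  # torch.Size([1, 16, 8])
```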
I am using version 1.59.0 of SNPE.
Dear customer,
Could you please share the dummy model with us for a deeper analysis of this model accuracy issue?
This may be caused by a difference in execution on the GPU, but it needs further investigation.
BR.
Wei
Dear weihuan,
Thanks! I have uploaded the scripts and model files I used to reproduce the issue to https://audeering-my.sharepoint.com/:u:/p/chausner/EYG2ypkvkYBLk6dSBg0OP.... Let me know if you need anything else.