Forums - Not able to get a valid accuracy on SD845 DSP runtime

v-shihha
Join Date: 7 Oct 19
Posts: 3
Posted: Mon, 2019-12-09 04:04

Hi, I am testing the accuracy loss due to quantization on the MobileNet V2 1.0 224 pretrained model provided on Google's GitHub.

I can reproduce the claimed accuracy on the full ImageNet-1k validation dataset on both the CPU and GPU runtimes, but it does not work on the DSP/AIP runtime, with either runtime (on-the-fly) quantization or post-training quantization (via snpe-dlc-quantize).

The Top-1 accuracy I get on CPU is 0.7184, on GPU 0.7146, and on DSP 0.01348, which is totally meaningless.

Also, if I run the quantized model on the GPU runtime, I get almost the same result as the FP32 runs. I suspect there might be some issue with input or output preprocessing, but I cannot find any details in the documentation.

For all of the benchmarks I am using the same pre-processed dataset (a series of files saved from NumPy).
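Roughly, the preprocessing is along the lines of the sketch below (a minimal sketch with placeholder file names; the pixel scaling matches the input range the quantizer reports in the log, min -1.0 / max ~0.992):

import numpy as np
from PIL import Image

# Minimal MobileNet V2 preprocessing sketch: resize to 224x224 and scale pixels
# to [-1, 1), i.e. (x - 128) / 128, which gives the min -1.0 / max 0.992188 range
# reported by snpe-dlc-quantize below. File names here are placeholders.
def preprocess(path, size=224):
    img = Image.open(path).convert("RGB").resize((size, size))
    arr = np.asarray(img, dtype=np.float32)
    return (arr - 128.0) / 128.0   # NHWC float32

# snpe-dlc-quantize / snpe-net-run read raw float32 tensors listed in a text file.
preprocess("example.jpg").tofile("example.raw")
with open("imgList.txt", "w") as f:
    f.write("example.raw\n")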

Here is the log from quantization. It successfully extracts the range of the input data, and I don't see anything weird.

snpe-dlc-quantize --input_dlc "../mobilenet_v2_1.0_224_frozen.dlc" --input_list "imgList.txt" --output_dlc "../mobilenet_v2_1.0_224_frozen_quant8_with7kpic.dlc" --verbose --enable_hta
[INFO] InitializeStderr: DebugLog initialized.
[INFO] Writing intermediate model
[WARNING] NetworkTopology::populateNetworkDesc network desc inputs is empty. Does this network have data/input layer?
[INFO] Setting activation for layer: input:0 and buffer: input:0
[INFO] bw: 8, min: -1.000000, max: 0.992188, delta: 0.007812, offset: -128.000000
[INFO] Setting activation for layer: MobilenetV2/Conv/Conv2D and buffer: MobilenetV2/Conv/BatchNorm/FusedBatchNorm:0
[INFO] bw: 8, min: -18.588268, max: 28.342507, delta: 0.184042, offset: -101.000000
[INFO] Setting activation for layer: MobilenetV2/Conv/Relu6 and buffer: MobilenetV2/Conv/Relu6:0
[INFO] bw: 8, min: 0.000000, max: 6.000000, delta: 0.023529, offset: 0.000000
[INFO] Setting activation for layer: MobilenetV2/expanded_conv/depthwise/depthwise and buffer: MobilenetV2/expanded_conv/depthwise/BatchNorm/FusedBatchNorm:0
[INFO] bw: 8, min: -28.662906, max: 36.018872, delta: 0.253654, offset: -113.000000
[INFO] Setting activation for layer: MobilenetV2/expanded_conv/depthwise/Relu6 and buffer: MobilenetV2/expanded_conv/depthwise/Relu6:0
[INFO] bw: 8, min: 0.000000, max: 6.000000, delta: 0.023529, offset: 0.000000
[INFO] Setting activation for layer: MobilenetV2/expanded_conv/project/Conv2D and buffer: MobilenetV2/expanded_conv/project/BatchNorm/FusedBatchNorm:0
[INFO] bw: 8, min: -47.089708, max: 43.879046, delta: 0.356740, offset: -132.000000
[INFO] Setting activation for layer: MobilenetV2/expanded_conv_1/expand/Conv2D and buffer: MobilenetV2/expanded_conv_1/expand/BatchNorm/FusedBatchNorm:0
[INFO] bw: 8, min: -52.806657, max: 63.276942, delta: 0.455230, offset: -116.000000
[INFO] Setting activation for layer: MobilenetV2/expanded_conv_1/expand/Relu6 and buffer: MobilenetV2/expanded_conv_1/expand/Relu6:0
[INFO] bw: 8, min: 0.000000, max: 6.000000, delta: 0.023529, offset: 0.000000
[INFO] Setting activation for layer: MobilenetV2/expanded_conv_1/depthwise/depthwise and buffer: MobilenetV2/expanded_conv_1/depthwise/BatchNorm/FusedBatchNorm:0
[INFO] bw: 8, min: -21.121309, max: 19.992689, delta: 0.161231, offset: -131.000000
[INFO] Setting activation for layer: MobilenetV2/expanded_conv_1/depthwise/Relu6 and buffer: MobilenetV2/expanded_conv_1/depthwise/Relu6:0
[INFO] bw: 8, min: 0.000000, max: 6.000000, delta: 0.023529, offset: 0.000000
[INFO] Setting activation for layer: MobilenetV2/expanded_conv_1/project/Conv2D and buffer: MobilenetV2/expanded_conv_1/project/BatchNorm/FusedBatchNorm:0
[INFO] bw: 8, min: -35.918198, max: 42.365054, delta: 0.306993, offset: -117.000000
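In case anyone wants to check these numbers: the reported delta/offset look like the standard 8-bit affine scheme, delta = (max - min) / 255 and offset = round(min / delta). A quick verification sketch (my own, not SNPE code) against the lines above:

# 8-bit affine quantization parameters from a min/max range.
def qparams(vmin, vmax, bw=8):
    delta = (vmax - vmin) / (2 ** bw - 1)
    return delta, round(vmin / delta)

print(qparams(-1.000000, 0.992188))    # (0.0078125, -128)  -> input:0
print(qparams(-18.588268, 28.342507))  # (~0.184042, -101)  -> Conv/BatchNorm
print(qparams(0.0, 6.0))               # (~0.023529, 0)     -> the Relu6 layers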

Here are the quantized and un-quantized DLC files, plus some images for testing.

https://drive.google.com/open?id=1fvDt4XdcFL7RpOhikW4UNyV9izQzCCjT

 

v-shihha
Join Date: 7 Oct 19
Posts: 3
Posted: Tue, 2019-12-10 01:16

So... I used the debug mode of snpe-net-run to get more insight, and I found that the earliest layers actually output correct results (when compared with the FP32 run), but after the first Conv + BN the data starts to make no sense.

Quote:
Result_0/MobilenetV2/Conv/Relu6:0.raw

Std: 0.04772818088531494, Variance: 0.0022779793944209814, Max Diff: 0.1740427017211914, FP32 MaxMin: 6.0,0.0 , INT8 MaxMin: 6.073394775390625,0.0
FP32 Feature Map, INT8 Feature Map
1.635183,         1.656380
3.161371,         3.128718
0.000000,         0.000000
0.000000,         0.000000
3.476882,         3.496803
5.749937,         5.705310
0.925256,         0.920211
0.000000,         0.000000
3.390900,         3.312761
1.716338,         1.656380
1.976652,         2.024465
0.000000,         0.000000
0.000000,         0.000000
0.000000,         0.000000
0.335724,         0.368085
1.626195,         1.656380
0.000000,         0.000000
2.628080,         2.760634
0.081039,         0.184042

0.000000,         0.000000

Quote:
Result_0/MobilenetV2/Conv/BatchNorm/FusedBatchNorm:0.raw

Std: 0.05691774562001228, Variance: 0.003239629790186882, Max Diff: 0.1740427017211914, FP32 MaxMin: 12.591891288757324,-15.163686752319336 , INT8 MaxMin: 12.514873504638672,-15.09146499633789
FP32 Feature Map, INT8 Feature Map
1.635183,         1.656380
3.161371,         3.128718
-1.441885,         -1.472338
-1.215653,         -1.288296
3.476882,         3.496803
5.749937,         5.705310
0.925256,         0.920211
-2.542784,         -2.576592
3.390900,         3.312761
1.716338,         1.656380
1.976652,         2.024465
-0.298735,         -0.368085
-0.195385,         -0.184042
-1.343657,         -1.288296
0.335724,         0.368085
1.626195,         1.656380
-0.024648,         0.000000
2.628080,         2.760634
0.081039,         0.184042

-1.308494,         -1.288296

Quote:
Result_0/MobilenetV2/Conv_1/Relu6:0.raw

Std: 1.4442336559295654, Variance: 2.085810899734497, Max Diff: 6.0, FP32 MaxMin: 6.0,0.0 , INT8 MaxMin: 6.115546226501465,0.0
FP32 Feature Map, INT8 Feature Map
0.000000,         0.000000
5.293875,         0.000000
0.000000,         0.000000
0.000000,         0.000000
0.000000,         0.000000
0.000000,         0.000000
0.000000,         0.000000
0.000000,         1.747299
6.000000,         0.000000
0.000000,         1.164866
0.000000,         0.000000
0.000000,         0.582433
0.000000,         0.000000
2.256081,         1.164866
0.000000,         0.000000
0.000000,         0.000000
0.000000,         0.000000
1.465396,         0.000000
0.000000,         0.000000

1.923604,         0.000000

 

You can see that from Conv_1 (after the first BN) the results no longer match the FP32 ones. I am wondering whether something is not working correctly in the quantization tool?
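(For reference, the diff statistics above were computed roughly along the lines of the sketch below; the paths are placeholders for the per-layer raw tensors dumped by the debug run, and both dumps are read back as float32.)

import numpy as np

# Compare a layer's output from the FP32 run against the same layer from the
# quantized run (both dumped as float32 raw files).
def compare(fp32_path, int8_path):
    a = np.fromfile(fp32_path, dtype=np.float32)
    b = np.fromfile(int8_path, dtype=np.float32)
    d = a - b
    print("Std:", d.std(), "Variance:", d.var(), "Max Diff:", np.abs(d).max())
    print("FP32 MaxMin:", a.max(), a.min(), " INT8 MaxMin:", b.max(), b.min())

compare("output_cpu/Result_0/MobilenetV2/Conv/Relu6:0.raw",
        "output_dsp/Result_0/MobilenetV2/Conv/Relu6:0.raw")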

jesliger
Join Date: 6 Aug 13
Posts: 75
Posted: Tue, 2019-12-10 04:36

Depending on which MobileNet you use, it may not be quantization-aware. SNPE's quantization tools don't address this, as quantization awareness needs to be injected during training. You need to use a network that has been validated to be quantization-aware. Google has trained some with fake-quantization nodes to ensure they work well when run on an 8-bit runtime like the DSP.

Are you using a MobileNet model that works well in a quantized runtime? Validate your model by running it fully quantized in TensorFlow (quantized nodes, quantized weights, quantized activations); see the sketch after the links below.

https://www.tensorflow.org/lite/performance/model_optimization
https://github.com/tensorflow/tensorflow/tree/r1.13/tensorflow/contrib/quantize

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/models/image_classification/overview.md
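For example, something along these lines (a minimal sketch using the tf.lite Interpreter; the .tflite name refers to Google's hosted quantized MobileNet V2, and the input here is just a dummy tensor):

import numpy as np
import tensorflow as tf

# Run a fully quantized MobileNet V2 .tflite to sanity-check accuracy outside SNPE.
interpreter = tf.lite.Interpreter(model_path="mobilenet_v2_1.0_224_quant.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

image = np.zeros((1, 224, 224, 3), dtype=inp["dtype"])  # replace with a real preprocessed image
interpreter.set_tensor(inp["index"], image)
interpreter.invoke()
print(interpreter.get_tensor(out["index"]).argmax())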

 

 

v-shihha
Join Date: 7 Oct 19
Posts: 3
Posted: Wed, 2019-12-11 03:52

Hi jesliger, thanks for your reply.

I have tried using the TFLite quantizer and running the fully quantized model. The result is Top-1 accuracy 0.4774, Top-5 accuracy 0.7212, which still preserves some accuracy.

 

I can accept something like a 20~30% accuracy loss, and according to https://www.tensorflow.org/lite/performance/model_optimization the Top-1 accuracy for post-training quantization should be 0.637, which still preserves a decent amount of accuracy.

In my case the Top-1 accuracy drops to 0.01348, which is only a little better than random output. Since the outputs of the first two nodes are pretty close to the FP32 ones, the data pre-processing and the initial std/mean extraction seem to be working well, so I am wondering whether the algorithm or the SDK might have a bug in it?
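(For reference, the Top-1/Top-5 numbers come from a simple tally roughly like the sketch below; the logits and labels in it are dummy placeholders, and the 1001 classes reflect the frozen model's extra background class.)

import numpy as np

# Top-k accuracy from a batch of logits and ground-truth labels.
def topk_accuracy(logits, labels, ks=(1, 5)):
    order = np.argsort(-logits, axis=1)
    return {k: float(np.mean([labels[i] in order[i, :k] for i in range(len(labels))]))
            for k in ks}

logits = np.random.randn(8, 1001)            # dummy model outputs
labels = np.random.randint(0, 1001, size=8)  # dummy ground truth
print(topk_accuracy(logits, labels))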

 

