Forums - x86 snpe-sample error

x86 snpe-sample error
xavier12358
Join Date: 11 Dec 16
Posts: 20
Posted: Wed, 2018-03-14 12:53
 
Hello,
 
I am trying to run the snpe-sample binary compiled on my computer with the SSD model, but I get this error:
 
Starting program: /home/xavier/Desktop/developpement/snpe-1.13.0/examples/NativeCpp/SampleCode/obj/local/x86_64-linux-clang/snpe-sample -b ITENSOR -d /home/xavier/Desktop/developpement/snpe-1.13.0/mobilenet_ssd.dlc -i /home/xavier/Desktop/developpement/snpe-1.13.0/models/SSD/data/cropped/raw_list.txt -o /home/xavier/Desktop/developpement/snpe-1.13.0/examples/NativeCpp/SampleCode/output/
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff58af700 (LWP 18416)]
[New Thread 0x7fffeffff700 (LWP 18417)]
[New Thread 0x7ffff50ae700 (LWP 18418)]
[New Thread 0x7ffff48ad700 (LWP 18419)]
SNPE Version: 1.13.0.0
 
Thread 1 "snpe-sample" received signal SIGSEGV, Segmentation fault.
__memcpy_avx_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:100
100 ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S: No such file or directory.
 
Even in debug mode I don't get much debug info. The DLC model works properly with the snpe-net-run executable, so I don't think the problem comes from the model file.
 
More information from the ldd command:
 
ldd ./obj/local/x86_64-linux-clang/snpe-sample 
linux-vdso.so.1 =>  (0x00007fff4f5d4000)
libSNPE.so => /home/xavier/Desktop/developpement/snpe-1.13.0/lib/x86_64-linux-clang/libSNPE.so (0x00007fa03207c000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fa031cfa000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fa031ae4000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa03171a000)
libsymphony-cpu.so => /home/xavier/Desktop/developpement/snpe-1.13.0/lib/x86_64-linux-clang/libsymphony-cpu.so (0x00007fa030c21000)
libatomic.so.1 => /usr/lib/x86_64-linux-gnu/libatomic.so.1 (0x00007fa030a19000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa030815000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa0305f8000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fa0302ef000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa03260e000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fa0300e7000)
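One thing I still need to rule out (just an assumption on my part, the backtrace doesn't confirm it): with the ITENSOR path, a segfault inside memcpy can be caused by a raw input file whose size doesn't match the network's expected input dimensions. If the network input is 300x300x3 float32, each raw file should be 300 * 300 * 3 * 4 = 1080000 bytes, which can be checked with something like this (example.raw stands for a hypothetical entry from raw_list.txt):

expected=$((300 * 300 * 3 * 4))            # 1080000 bytes for a 300x300x3 float32 input
actual=$(stat -c %s cropped/example.raw)   # hypothetical file listed in raw_list.txt
echo "expected=$expected actual=$actual"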
 
 
 
xavier12358
Join Date: 11 Dec 16
Posts: 20
Posted: Thu, 2018-03-15 01:43

I also tried to run the model on my Qualcomm 820. The CPU mode works properly, but the GPU and the DSP crash:

 /bin/aarch64-android-clang3.8/snpe-net-run-g --container ./mobilenet_ssd.dlc  *

error_code=605; error_message=CPU network not supported Null Neural Network CPU instance requested - this is invalid, No CPU implementation found; error_component=CPU Runtime; line_no=40; thread_id=548076550808
/bin/aarch64-android-clang3.8/snpe-net-run  --use_gpu  --container ./mobilenet*
error_code=802; error_message=Layer parameter value is invalid in GPU. Layer BoxPredictor_0/Reshape_1:0 : output width = 1083, depth = 91 width * depth (packed) = 24909 exceeds maximum image width 16384 for Adreno A530; error_component=GPU Runtime; line_no=396; thread_id=547904776856
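(My reading of the 24909 figure, not an official explanation: the depth appears to be packed four channels per texel, so width * packed depth = 1083 * ceil(91/4) = 1083 * 23 = 24909, which is above the 16384 texture-width limit of the Adreno 530.)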
/bin/aarch64-android-clang3.8/snpe-net-run  --use_dsp  --container ./mobilenet*
The selected runtime is not available on this platform. Continue anyway to observe the failure at network creation time.
error_code=101; error_message=Invalid parameter in user config. Attempted to set a neural network configuration option DSP that is not supported on this platform.; error_component=System Configuration; line_no=107; thread_id=547783043736
xavier12358
Join Date: 11 Dec 16
Posts: 20
Posted: Fri, 2018-03-16 01:56

Just an update, in case people are interested in this post.

I forgot to set ADSP_LIBRARY_PATH:

 export ADSP_LIBRARY_PATH="/data/local/tmp/lib/dsp/aarch64-android-gcc4.9;/system/lib/rfsa/adsp;/system/vendor/lib/rfsa/adsp;/dsp/;/data/local/tmp/adsp/"
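For reference, this assumes the DSP skel libraries shipped under the SDK's lib/dsp directory have already been pushed to one of those paths, something like (my own layout, adjust as needed; $SNPE_ROOT is the SDK root):

adb push $SNPE_ROOT/lib/dsp/. /data/local/tmp/lib/dsp/aarch64-android-gcc4.9/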

Now I get this error; the network doesn't seem to be runnable:

error_code=1000; error_message=Layer is not supported. Layer FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/BatchNorm/batchnorm/mul/_99__cf__102:0 of type Constant not supported by DSP runtime; error_component=Model Validation; line_no=249; thread_id=-325982924
 

Just to summarize:

On CPU: OK

-> time with the DLC model: ~1100 ms

-> time with the TensorFlow model and API: ~300 ms

On GPU:

error_code=802; error_message=Layer parameter value is invalid in GPU. Layer BoxPredictor_0/Reshape_1:0 : output width = 1083, depth = 91 width * depth (packed) = 24909 exceeds maximum image width 16384 for Adreno A530; error_component=GPU Runtime; line_no=396; thread_id=547904776856

On DSP:

error_code=1000; error_message=Layer is not supported. Layer FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/BatchNorm/batchnorm/mul/_99__cf__102:0 of type Constant not supported by DSP runtime; error_component=Model Validation; line_no=249; thread_id=-325982924

Can someone from Qualcomm help me?

 

 

jesliger
Join Date: 6 Aug 13
Posts: 75
Posted: Fri, 2018-03-16 05:41

MobilenetSSD is not supported on DSP.

Some of the layers are not supported on GPU.  You need to enable the CPU fallback feature.  When you do this, layers that can run on GPU will stay on GPU, while the unsupported ones will run on CPU instead.  Run snpe-net-run -h to find out how to enable it.  If you are writing your own app, it's also described in the API documentation.
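For the command-line tool, it looks roughly like the sketch below (the flag name is taken from a command that appears later in this thread; paths are placeholders):

snpe-net-run --use_gpu --container mobilenet_ssd.dlc --input_list raw_list.txt --enable_cpu_fallback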

xavier12358
Join Date: 11 Dec 16
Posts: 20
Posted: Fri, 2018-03-16 06:39

Thank you for the answer. In the SNPEDiag file I get this information:

Average SNPE Statistics:
------------------------------
Total Inference Time: 71088 us
Forward Propagate Time: 70758 us
What are the Total Inference Time and the Forward Propagate Time?
jesliger
Join Date: 6 Aug 13
Posts: 75
Posted: Fri, 2018-03-16 07:18

Those times are documented in the SNPE user's guide, in the benchmarking chapter.

xavier12358
Join Date: 11 Dec 16
Posts: 20
Posted: Fri, 2018-03-16 09:37
I don't really understand the documentation. Does the Total Inference Time include the time of forward propagation?
jesliger
Join Date: 6 Aug 13
Posts: 75
Posted: Fri, 2018-03-16 10:40

Yes, the Total Inference Time includes the forward propagate time.
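(So in the numbers above, roughly 71088 - 70758 = 330 us of the total would be spent outside forward propagation, e.g. on input/output handling, if I read the statistics correctly.)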

 

madhavajay
Join Date: 15 Mar 18
Posts: 22
Posted: Mon, 2018-03-19 08:37

Hi xavier12358,

Can you provide some feedback on the performance using the 820 chip?

Is it able to process image frames with SSD fast enough to develop on?

Is it faster with the DLC format and SDK? You mention some numbers that make it look slower than with plain TF, but were they on your desktop or on the 820 chip?

Just wondering what kind of performance improvement the NPE SDK provides for the SSD model, in particular on that cheaper 820 chip.

xavier12358
Join Date: 11 Dec 16
Posts: 20
Posted: Mon, 2018-03-19 08:48

Hello madhavajay,

On my Snapdragon 820 (64-bit):

With TensorFlow, the time to process an image is 400 ms up to 600 ms.

On CPU with SNPE, the time to process an image is ~1100 ms.

On CPU/GPU with SNPE, the time to process an image is approximately 77 ms.

On DSP, the SSD model is not compatible.

I hope this answers your questions.

madhavajay
Join Date: 15 Mar 18
Posts: 22
Posted: Mon, 2018-03-19 11:06

Hi,

Yes, that does, thank you!

So it appears that with GPU support the device is quite fast; however, it doesn't have the DSP that the 845 has.

I have asked in another thread but haven't heard whether there's any guarantee of support for NNAPI and TensorFlow Lite, but the device itself can handle the SSD network quite well.

madhavajay
Join Date: 15 Mar 18
Posts: 22
Posted: Mon, 2018-04-30 15:18

Hi xavier12358,

I finally got my SNPE demo running on the LG G6 (Qualcomm 821) with Android 7.0.

I have tested the snpe-net-run tool on the device, and the DSP demo for Inception v3 works, so the device has the DSP drivers. I'm sure it also does GPU acceleration, right?

However, the performance of the SSD after conversion using the DLC tool (I have used both the 1.13.0 and 1.14.1 SDKs, for the AAR file and for the conversion, with no difference) is still really bad: the CPU is about 750 ms and the GPU is about 500 ms.

It's basically the same as the normal TensorFlow SSD MobileNet v1 .pb demo.

What did you do to get your amazing performance?

Can you say what hardware and what version of the SDK you used, and whether there are any special steps or configuration options to get this:

You said you got:

On CPU/GPU with SNPE, the time to process an image is approximately 77 ms.

I want that performance. :)

 

 
xavier12358
Join Date: 11 Dec 16
Posts: 20
Posted: Mon, 2018-04-30 22:57
What commands did you use?
madhavajay
Join Date: 15 Mar 18
Posts: 22
Posted: Tue, 2018-05-01 07:20
From the docs: /doc/html/convert_mobilenetssd.html
 
snpe-tensorflow-to-dlc --graph ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb -i Preprocessor/sub 300,300,3 --out_node detection_classes --out_node detection_boxes --out_node detection_scores --dlc mobilenet_ssd.dlc --allow_unconsumed_nodes
 
The file I get is 55517019 bytes.
MD5 (mobilenet_ssd.dlc) = 82d2477d365cd6cbad9752aa99015402
 
I have tried this with the 1.13.0 and 1.14.1 SDKs; the results seem identical.
Then in Android Studio I have used:
 
NeuralNetwork.Runtime order = NeuralNetwork.Runtime.GPU;
 
String[] outputLayers = {"Postprocessor/BatchMultiClassNonMaxSuppression", "add_6"};
      d.inferenceInterface = new SNPE.NeuralNetworkBuilder(application)
              .setPerformanceProfile(NeuralNetwork.PerformanceProfile.HIGH_PERFORMANCE)
              .setRuntimeOrder(order)
              .setModel(dlcFile)
              .setCpuFallbackEnabled(true)
              .setOutputLayers(outputLayers)
              .build();
 
Everything works, but it's slow: about 500 ms on my LG G6 and about 300 ms on an 820 Qualcomm dev board.
 
Any ideas?
Are you able to share your DLC file and some example code?
What hardware are you using exactly?
 
I have put my file here for download if you want to try it, to see whether it's the problem:
 
Really keen to get this working as fast as you have it!
:)
xavier12358
Join Date: 11 Dec 16
Posts: 20
Posted: Wed, 2018-05-02 01:05

In fact I didn't try the SDK integration; I just tested the snpe-net-run binary from the SDK folder:

./bin/aarch64-android-gcc4.9/snpe-net-run --use_gpu   --container ./model/mobilenet_ssd.dlc   --input_list ./data/list.txt --enable_cpu_fallback

When I execute the binary, it generates the results and a log file. In the log file you can get the execution time for each run.
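To read the timings I use the diag viewer tool from the SDK's bin directory. The tool name, option, and log file name below are from memory and may differ between SDK versions:

snpe-diagview --input_log output/SNPEDiag.log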

 

 

 

 

madhavajay
Join Date: 15 Mar 18
Posts: 22
Posted: Wed, 2018-05-02 09:22

Hi xavier12358,

Can you comment on what hardware EXACTLY this was tested on?
If I can replicate your results then I can get the performance I need for a specific project.

Thanks!
