Forums - Snpe-run

Snpe-run
danweil24
Join Date: 17 Nov 21
Posts: 12
Posted: Tue, 2023-05-02 01:17

I'm currently working with the HDK 8350. Running quantization on identical images, I observed that the model's runtime was 2-3 ms slower when using SNPE 2.5. I also tried uint8, as well as other SNPE versions, but still saw the timing issue. In addition, when using snpe-net-run I noticed a big difference in later SNPE versions between avg_total_inference_time and avg_forward_propagate_time. Do you know if this is a recognized problem?

weihuan
Join Date: 12 Apr 20
Posts: 270
Posted: Fri, 2023-05-05 08:33

Dear developer,

Could you help us understand the ~2-3 ms time gap you mentioned on SNPE 2.5?

You can specify --profiling_level moderate to check the accelerator and NetRun times.

BR.

Wei
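(For reference, a minimal sketch of the invocation Wei is describing, assuming the SNPE 2.x snpe-net-run flag name `--profiling_level`; the container and input-list names are the ones used later in this thread, not verified values:)

```shell
# Sketch only: '--profiling_level moderate' asks snpe-net-run to record
# per-stage timing (accelerator vs. NetRun) into output/SNPEDiag_*.log.
# Container/input-list names are placeholders taken from this thread.
run_cmd="snpe-net-run --container yolox_s_quant_2_5.dlc \
--input_list input_list.txt --use_dsp --profiling_level moderate"
echo "$run_cmd"
# On a device/host with SNPE installed, pretty-print the resulting log with:
# snpe-diagview --input_log output/SNPEDiag_0.log
```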

danweil24
Join Date: 17 Nov 21
Posts: 12
Posted: Sun, 2023-05-21 04:42

Dear Weihuan, 

I used snpe-net-run to get those measurements. With SNPE 2.5, the same network shows a ~2 ms difference between avg_total_inference_time and the NetRun time, unlike with SNPE 1.5.

yunxqin
Join Date: 2 Mar 23
Posts: 44
Posted: Sun, 2023-06-04 08:52

Dear customer,

Could you please share the model you used via GitHub so that I can run it on the device and analyze the cause of this problem?

BR.

Yunxiang

danweil24
Join Date: 17 Nov 21
Posts: 12
Posted: Mon, 2023-06-05 01:50

Dear Yunxiang,

I use YOLOX-S (just the ONNX version converted to DLC):

https://github.com/Megvii-BaseDetection/YOLOX

Br,

Dan 

yunxqin
Join Date: 2 Mar 23
Posts: 44
Posted: Mon, 2023-06-05 22:45

Dear customer,

  • I have run this model on the 8350 using SNPE 2.5 and got an avg_total_inference_time of 8815 us. Is this similar to your data?

  • Did you say that SNPE v1.50 produced the data with the big gap? We don't have SNPE 1.5. Looking forward to your reply.

BR.

Yunxiang

danweil24
Join Date: 17 Nov 21
Posts: 12
Posted: Sun, 2023-06-11 01:42

Dear Yunxiang,

I got different results: with snpe-net-run it is around 33 ms.

These are the commands we used to convert, quantize, and run the YOLOX-S model, together with the results from snpe-net-run:

Commands: 

Converter Command: snpe-onnx-to-dlc adjust_nms_features_dims=True align_matmul_ranks=True batch=None copyright_file=None custom_io= custom_op_config_paths=None debug=-1 define_symbol=None disable_batchnorm_folding=False dry_run=None dumpIR=False dump_custom_io_config_template= dump_inferred_model=False dump_value_info=False enable_strict_validation=False expand_gru_op_structure=True extract_color_transform=True float_bw=32 force_prune_cast_ops=False handle_gather_negative_indices=True inject_cast_for_gather=True input_dim=[['images', '1,3,640,640']] input_dtype=[] input_encoding=[] input_layout=[] input_type=[] keep_disconnected_nodes=False keep_int64_inputs=False keep_quant_nodes=False match_caffe_ssd_to_tf=True model_version=yolox_s no_simplification=False op_package_lib= out_names=['output'] package_name=None perform_axes_to_spatial_first_order=True prepare_inputs_as_params=False preprocess_lstm_ops=True preprocess_roi_pool_inputs=True quantization_overrides= squash_box_decoder=True unroll_gru_time_steps=True unroll_lstm_time_steps=True use_convert_quantization_nodes=False validation_target=[]
Custom Model Version: yolox_s
Model Copyright: N/A
Quantizer Command: snpe-dlc-quant help=false version=false verbose=false quiet=false silent=false debug=[] debug1=false debug2=false debug3=true log-mask=[] log-file=[] log-dir=[] log-file-include-hostname=false input_dlc=[/mnt/H2/detectors/yolox3/device_pipeline/models/yolox_s/yolox_s.dlc] input_list=[/mnt/H2/detectors/yolox3/device_pipeline/inputs/input_list_files/yolox_s/server_input_list.txt] no_weight_quantization=false output_dlc=[/mnt/H2/detectors/yolox3/device_pipeline/models/yolox_s/yolox_s_quantized.dlc] use_enhanced_quantizer=true use_adjusted_weights_quantizer=false optimizations=[] override_params=false use_encoding_optimizations=false udo_package_path=[] use_symmetric_quantize_weights=false use_native_dtype=false bitwidth=[] weights_bitwidth=[] act_bitwidth=[] float_bitwidth=[]
bias_bitwidth=[] clip_alpha=[] axis_quant=false
Run Command: snpe-net-run --container yolox_s_quant_2_5.dlc --input_list input_list.txt --use_dsp
 
Results from the CSV output of snpe-net-run:
 
AVG_FORWARD_PROPAGATE_TIME=31622us
 
AVG_TOTAL_INFERENCE_TIME=33438us
 
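(The gap between these two timers can be computed directly. Roughly, AVG_TOTAL_INFERENCE_TIME adds per-inference work outside forward propagation, such as host-side input/output handling, on top of AVG_FORWARD_PROPAGATE_TIME; the exact breakdown is an assumption here, not something the thread confirms. Using the figures above:)

```shell
# Difference between the two aggregate timers from the stats CSV above:
# time spent per inference outside forward propagation.
total_us=33438
forward_us=31622
overhead_us=$((total_us - forward_us))
echo "overhead outside forward propagation: ${overhead_us}us per inference"
# → overhead outside forward propagation: 1816us per inference
```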

BR, Dan

yunxqin
Join Date: 2 Mar 23
Posts: 44
Posted: Thu, 2023-06-15 20:06

Dear customer,

You can use 'snpe-net-run --perf_profile burst' to reduce the running time.

Br.

Yunxiang
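(A sketch of the suggested invocation: `--perf_profile burst` holds the clocks at a high level for the duration of the run, trading power for latency. The container and input-list names are taken from earlier in this thread and are placeholders:)

```shell
# Sketch: the same run command as before, with the burst performance profile.
burst_cmd="snpe-net-run --container yolox_s_quant_2_5.dlc \
--input_list input_list.txt --use_dsp --perf_profile burst"
echo "$burst_cmd"
# Compare AVG_TOTAL_INFERENCE_TIME in the stats CSV with and without the flag
# to see what the profile actually saves on your device.
```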

