Forums - snpe-net-run running batched inputs failed

snpe-net-run running batched inputs failed
bao21987
Join Date: 28 Sep 22
Posts: 2
Posted: Tue, 2022-11-22 06:07

 

  1. I want to run a BERT model on the DSP for a text-pair matching task, so I need to run batched inputs.
  2. I use snpe-dlc-quantize to quantize my model (BERT), so I have to set batch_size to 1.
  3. At inference time, I want to change the batch size using the "input_dimensions" argument:

snpe-net-run --container model_encoder.dlc --input_list snpe_inputs/encoder_inputs_10.txt --output_dir snpe_output --input_name hidden_states --input_dimensions 4,1,128,312 --input_name attention_mask --input_dimensions 4,1,128,1

and I got this error:

error_code=107; error_message=Changing of dimensions is not supported by layer. error_code=107; error_message=Changing of dimensions is not supported by layer. MatMul layer name MatMul_20; error_component=System Configuration; line_no=1577; thread_id=139909755639552; error_component=System Configuration; line_no=342; thread_id=139909756180480

The failing layer is the self-attention Q*K calculation:

query: [batch, num_heads, seq_len, head_size], key: [batch, num_heads, head_size, seq_len]

MatMul(query, key)

SNPE SDK version: 1.67.0

ONNX version: 1.6.0

Is there any way I can change the batch size while using a quantized model? Thanks!

 

 

weihuan
Join Date: 12 Apr 20
Posts: 270
Posted: Sun, 2022-11-27 00:59

Dear developer,

Per my understanding, SNPE does not support multi-batch inference if you quantized the model with a single batch. Please try quantizing your model with multi-batch inputs, so that SNPE will recognize the multi-batch dimension.
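One way to follow this advice, sketched below, is to fix the batch dimension at 4 during the ONNX-to-DLC conversion and then quantize that multi-batch DLC. The input names and shapes are taken from this thread; the exact converter flags should be checked against your SDK version's documentation.

```shell
# Convert the ONNX model with the batch dimension already set to 4.
# The -d/--input_dim override lets snpe-onnx-to-dlc fix input shapes;
# names and dimensions below are from this thread and may need adjusting.
snpe-onnx-to-dlc --input_network model_encoder.onnx \
    -d hidden_states 4,1,128,312 \
    -d attention_mask 4,1,128,1 \
    -o model_encoder_b4.dlc

# Quantize with an input list whose .raw files are batch-4 tensors,
# so the calibration data matches the DLC's input shapes.
snpe-dlc-quantize --input_dlc model_encoder_b4.dlc \
    --input_list snpe_inputs/encoder_inputs_b4.txt \
    --output_dlc model_encoder_b4_quantized.dlc
```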

bao21987
Join Date: 28 Sep 22
Posts: 2
Posted: Wed, 2023-01-04 22:59

I used snpe-dlc-quantize to quantize my model, and it failed when quantizing with multi-batch inputs.

The tutorial says: "The tool requires the batch dimension of the DLC input file to be set to 1 during the original model conversion step."

Can I convert an ONNX model with batch_size=4 to a DLC model, and then quantize that DLC model? How should I do this? Thank you.

The data I use for quantization looks like this (every file is a tensor with batch_size=1):

"

#output1name  output2name

embeddings:=snpe_inputs/element_embeddings/0.raw attention_mask:=snpe_inputs/element_attention_mask/0.raw

embeddings:=snpe_inputs/element_embeddings/1.raw attention_mask:=snpe_inputs/element_attention_mask/1.raw

embeddings:=snpe_inputs/element_embeddings/2.raw element_attention_mask:=snpe_inputs/element_attention_mask/2.raw

"

