Hi, I wrote a UDL called Interp and it worked well on snpe-1.23.0.245. I used snpe-caffe-to-dlc to convert the .caffemodel to .dlc, and the resulting .dlc was correct.
I saw that the latest SNPE version is 1.31, so I wanted to update my SNPE. I rewrote the Interp UDL based on SNPE 1.31, but I got an error while converting the caffemodel: SNPE inserts a permute layer before and after my UDL, and I don't know why. So I went back to the UDL Tutorial, which worked fine.
Since the scale UDL in the SNPE UDL Tutorial comes after a fully connected layer, its input tensor is 2-dimensional. I wanted to test MyCustomScale with a 4-dimensional input tensor, so I added a MyCustomScale layer after the "pool1" layer. The prototxt is as follows:
name: "LeNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 1 dim: 28 dim: 28 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "scale1"
  type: "MyCustomScale"
  bottom: "pool1"
  top: "scale1"
  scale_param {
    bias_term: false
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "scale1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "scale2"
  type: "MyCustomScale"
  bottom: "ip1"
  top: "scale2"
  scale_param {
    bias_term: false
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "scale2"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}
I got a .caffemodel after training and then converted it to .dlc. The converter output is:
2019-11-06 13:01:46,258 - 170 - INFO - Loading UDLs from module my_udl_layers using factory function udl_factory_func
2019-11-06 13:01:46,303 - 170 - INFO - INFO_DLC_SAVE_LOCATION: Saving model at mycustomlenet.dlc
/home/zhaoy/WORK/Dev/Qualcomm_SDK/snpe-1.31.0.522/lib/python/snpe/converters/common/converter_ir/ir_to_dlc.py:212: RuntimeWarning: error_code=1002; error_message=Layer paramter value is invalid. Layer conv2: weights have wrong number of input channels; error_component=Model Validation; line_no=999; thread_id=139934198486784
node.op.groups)
/home/zhaoy/WORK/Dev/Qualcomm_SDK/snpe-1.31.0.522/lib/python/snpe/converters/common/converter_ir/ir_to_dlc.py:525: RuntimeWarning: error_code=1004; error_message=Layer parameters combination is invalid. Layer ip1: mismatch between size of input pool2.nsc (1600) and width of weight matrix (800); error_component=Model Validation; line_no=1413; thread_id=139934198486784
node.output_names[0])
2019-11-06 13:01:46,314 - 170 - INFO - INFO_CONVERSION_SUCCESS: Conversion completed successfully
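For what it's worth, here is my own arithmetic on the ip1 size mismatch in the warning above; the numbers are taken from the shapes in the snpe-dlc-info table, not computed by SNPE itself:

```python
# The converter warning says the size of input pool2.nsc (1600) does not
# match the width of the ip1 weight matrix (800).

# pool2.nsc is reported as 1x4x50x8 after the inserted permute:
permuted_size = 1 * 4 * 50 * 8   # 1600 elements, as in the warning

# The Caffe-trained ip1 weights expect pool2 to be 50 channels of 4x4
# (conv2: 50 filters, pooled down to 4x4 spatially):
expected_size = 50 * 4 * 4       # 800, the weight matrix width

print(permuted_size, expected_size)  # 1600 800
```

So the mismatch seems to come from the permute changing how the downstream shapes are interpreted.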
The output of snpe-dlc-info -i mycustomlenet.dlc is:
DLC info for: /home/zhaoy/WORK/Dev/Qualcomm_SDK/snpe-1.31.0.522/examples/Python/UdlExample.ori/mycustomlenet.dlc
Model Version: N/A
Model Copyright:N/A
-----------------------------------------------------------------------------------------------------------------------
| Id | Name | Type | Inputs | Outputs | Out Dims | Runtimes | Parameters |
-----------------------------------------------------------------------------------------------------------------------
| 0 | data | data | data | data | 1x28x28x1 | A D G C | input_preprocessing: passthrough |
| | | | | | | | input_type: default |
| 1 | conv1 | convolutional | data | conv1 | 1x24x24x20 | A D G C | padding x: 0 |
| | | | | | | | padding y: 0 |
| | | | | | | | padding mode: zero |
| | | | | | | | stride x: 1 |
| | | | | | | | stride y: 1 |
| | | | | | | | num filters: 20 |
| | | | | | | | kernel: 5x5 |
| | | | | | | | param count: 520 (0.121%) |
| | | | | | | | MACs per inference: 288k (7.36%) |
| 2 | pool1 | pooling | conv1 | pool1 | 1x12x12x20 | A D G C | pool size x: 2 |
| | | | | | | | pool size y: 2 |
| | | | | | | | stride x: 2 |
| | | | | | | | stride y: 2 |
| | | | | | | | padding x: 0 |
| | | | | | | | padding y: 0 |
| | | | | | | | pool_type: POOL_MAX |
| | | | | | | | MACs per inference: 2k (0.0736%) |
| 3 | pool1.nsc | permute | pool1 | pool1.nsc | 1x12x20x12 | A D G C | permute_order: [0L, 2L, 3L, 1L] |
| | | | | | | | MACs per inference: 11k (0.294%) |
| 4 | scale1 | user_defined | pool1.nsc | scale1 | 1x20x12x12 | A D G C | blob_size: 97 |
| 5 | conv2 | convolutional | scale1 | conv2 | 1x16x8x50 | | padding x: 0 |
| | | | | | | | padding y: 0 |
| | | | | | | | padding mode: zero |
| | | | | | | | stride x: 1 |
| | | | | | | | stride y: 1 |
| | | | | | | | num filters: 50 |
| | | | | | | | kernel: 5x5 |
| | | | | | | | param count: 25k (5.82%) |
| | | | | | | | MACs per inference: 3M (81.7%) |
| 6 | pool2 | pooling | conv2 | pool2 | 1x8x4x50 | A D G C | pool size x: 2 |
| | | | | | | | pool size y: 2 |
| | | | | | | | stride x: 2 |
| | | | | | | | stride y: 2 |
| | | | | | | | padding x: 0 |
| | | | | | | | padding y: 0 |
| | | | | | | | pool_type: POOL_MAX |
| | | | | | | | MACs per inference: 1k (0.0409%) |
| 7 | pool2.nsc | permute | pool2 | pool2.nsc | 1x4x50x8 | A D G C | permute_order: [0L, 2L, 3L, 1L] |
| | | | | | | | MACs per inference: 6k (0.163%) |
| 8 | ip1 | fully_connected | pool2.nsc | ip1 | 1x500 | | param count: 400k (92.9%) |
| | | | | | | | MACs per inference: 400k (10.2%) |
| 9 | relu1 | neuron | ip1 | relu1.ip1 | 1x500 | A D G C | a: 0 |
| | | | | | | | b: 0 |
| | | | | | | | min_clamp: 0 |
| | | | | | | | max_clamp: 0 |
| | | | | | | | func: relu |
| 10 | scale2 | user_defined | relu1.ip1 | scale2 | 1x500 | A D G C | blob_size: 2017 |
| 11 | ip2 | fully_connected | scale2 | ip2 | 1x10 | A D G C | param count: 5k (1.16%) |
| | | | | | | | MACs per inference: 5k (0.128%) |
| 12 | prob | softmax | ip2 | prob | 1x10 | A D G C | |
-----------------------------------------------------------------------------------------------------------------------
Note: The supported runtimes column assumes a processor target of Snapdragon 835 (8998)
Key : A:AIP
D:DSP
G:GPU
C:CPU
Total parameters: 430572 (1 MB assuming single precision float)
Total MACs per inference: 3M (100%)
Converter command: snpe-caffe-to-dlc udl=['my_udl_layers', 'udl_factory_func'] enable_strict_validation=False input_type=[] input_encoding=[] disable_batchnorm_folding=False caffe_bin=/home/zhaoy/Soft/caffe/examples/mnist/mycustomlenet_iter_10000.caffemodel model_version=None validation_target=[] debug=-1 copyright_file=None
DLC created with converter version: 1.31.0.522
Est. Steady-State Memory Needed to Run: 2.0 MiB
-----------------------------------------------------------------------------------------------------------------------
Note that I inserted 2 UDLs, and scale2 is the same as in the UDL Tutorial. From the dlc info we can see that SNPE automatically inserts a permute op before scale1 and another after it, while scale2 is handled normally and correctly. Why does SNPE insert these permute ops?
I am using SNPE-1.31.0.522. The .prototxt, .caffemodel, and .dlc files are available at https://github.com/zhaoyang-star/SNPE-UDL-TEST
I am looking forward to your reply. Thanks in advance.
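In case it helps reproduce the shapes, the permute_order [0L, 2L, 3L, 1L] that snpe-dlc-info reports for pool1.nsc behaves like a NumPy transpose. This is just my reading of the Out Dims column, not SNPE's internal code:

```python
import numpy as np

# The dlc info reports pool1's Out Dims as 1x12x12x20 (channel-last layout).
pool1 = np.zeros((1, 12, 12, 20))

# The inserted pool1.nsc layer applies permute_order [0, 2, 3, 1]:
pool1_nsc = pool1.transpose(0, 2, 3, 1)

print(pool1_nsc.shape)  # (1, 12, 20, 12), matching pool1.nsc's Out Dims
```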