Snapdragon Neural Processing Engine SDK
Reference Guide
Supported Network Layers

SNPE supports the network layer types listed in the table below.

See Limitations for details on the limitations and constraints for the supported runtimes and individual layer types.

All layers supported by the GPU runtime are valid in both GPU modes: GPU_FLOAT32_16_HYBRID and GPU_FLOAT16.
GPU_FLOAT32_16_HYBRID - data storage is done in half float and computation is done in full float.
GPU_FLOAT16 - both data storage and computation are done in half float.
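
The numerical difference between the two modes can be illustrated with a small, purely arithmetic sketch. It does not use any SNPE API; the helper names (`to_half`, `to_single`, `dot_hybrid`, `dot_half`) are hypothetical, and it assumes that "computation in half float" means every intermediate result is rounded to half precision, which may differ from what a given GPU actually does.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_single(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (32-bit) value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_hybrid(a, b):
    """Half-float storage, full-float arithmetic (GPU_FLOAT32_16_HYBRID-style)."""
    acc = 0.0
    for x, y in zip(a, b):
        # Inputs are read from half storage; products and sums stay in float32.
        acc = to_single(acc + to_single(to_half(x) * to_half(y)))
    return to_half(acc)  # only the stored result is rounded to half


def dot_half(a, b):
    """Half-float storage and half-float arithmetic (GPU_FLOAT16-style)."""
    acc = 0.0
    for x, y in zip(a, b):
        # Assumption: every intermediate result is rounded back to half.
        acc = to_half(acc + to_half(to_half(x) * to_half(y)))
    return acc


ones = [1.0] * 4096
hybrid_result = dot_hybrid(ones, ones)
half_result = dot_half(ones, ones)
print(hybrid_result, half_result)
```

In this sketch the half-only accumulator stops growing at 2048, because 2049 is not representable in half precision, while the hybrid accumulator reaches 4096. Long reductions are therefore the kind of place where the two modes could plausibly diverge.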

A list of supported ONNX operations can be found at ONNX Operator Support.

Note: this table is outdated and does not reflect the current state of supported layers/backends.

Layer Type | Description | Caffe Equivalent | Caffe2 Equivalent | TensorFlow Equivalent | TFLite Equivalent | Onnx Equivalent | PyTorch Equivalent | CPU | GPU | DSP/AIP
ArgMax Returns the index with the largest value across axes of a tensor. n/a n/a ArgMax n/a (?)
Batch normalization (+ Scaling) Batch normalization (optionally followed by scaling operation). Maps to the combination of batch_norm_layer followed immediately by scale_layer. (?) batch_norm_layer.cpp, scale_layer.cpp; spatial_batch_norm_op.cc; batch_normalization; BatchNormalization; torch.nn.BatchNorm2d (?)
Channel Shuffle Interleaves the channels in groups. The number of channels must be divisible by the number of groups. At least 4 channels are required for this layer to have any effect. n/a channel_shuffle_op.h n/a n/a n/a torch.nn.PixelShuffle
Color space conversion | Converts input image color format (encoding type) into SNPE native color space. Color space conversion parameters are provided as an option to the model converter tool. | There is no such Caffe layer by itself. This functionality is technically part of the Caffe data provider. data_layer.cpp | n/a | n/a | n/a | n/a | n/a
Concatenation This layer concatenates multiple inputs into a single output. concat_layer.cpp concat_split_op.cc concat Concatenation Concat torch.cat
Convolution | Computes dot products between the entries of the filter and the input at any position. | Includes support for groups and dilation. conv_layer.cpp | Includes support for spatial reflection padding. conv_op.cc | conv2d | conv_2d | Conv | torch.nn.Conv2d
Crop Crops one layer to the dimensions of a reference layer. crop_layer.cpp utility_ops.cc n/a
CropAndResize Crops normalized regions from a batch of images and scales them to a desired output size. crop_and_resize n/a
CrossMap Response Normalization This is an option within LRN layer. lrn_layer.cpp n/a local_response_normalization n/a LRN
Deconvolution Performs deconvolution operation. deconv_layer.cpp conv_transpose_op.cc conv2d_transpose transpose_conv ConvTranspose torch.nn.ConvTranspose2d
Depthwise Convolution | Performs a 2D depthwise convolution. | Equivalent to Convolution with 'num_output' = input channels and 'group' = 'num_output'. conv_layer.cpp | Equivalent to Convolution with 'num_output' = input channels and 'group' = 'num_output'. conv_op.cc | tf.nn.depthwise_conv2d | tfl.depthwise_conv_2d | n/a | n/a
Detection Output | Generates the detection output based on location and confidence predictions by performing non-maximum suppression. Typically used in SSD networks. | Note that this layer is not available on the tip of Caffe. It requires a compatible branch of Caffe. detection_output_layer.cpp | n/a | n/a | n/a | n/a | n/a
Dropout Layer is used for training only. Converters remove this layer from DLC creation. dropout_layer.cpp dropout_op.cc dropout n/a Dropout torch.nn.Dropout n/a n/a n/a
Elementwise | Supports SUM, DIV, PROD, MAX, MIN, and SUB mode with coefficients. | Support for MAX, SUM, and PROD. eltwise_layer.cpp | Support for SUM and MAX. utility_ops.cc | add, add_n, mul, maximum, minimum, subtract | add, sum, div, mul, maximum, minimum, sub | Add, Mul, Max, Min, Sub, Sum | torch.sum, torch.mul, torch.max, torch.min
Elementwise Unary Supports ABS, CEIL, EXP, FLOOR, LOG, NEG, ROUND, SIN, and SQRT. floor, sqrt; abs, exp, floor, sqrt; Abs, Ceil, Exp, Floor, Log, Neg, Round, Sin, Sqrt
Elu activation function: elu [ i.e., f(x) = max(0,x) + a*(exp(min(0,x))-1) ] elu_layer.cpp elu_op.cc elu elu
Flatten Flatten an input to a layer flatten_layer.cpp utility_ops.cc flatten Flatten torch.flatten
Fully connected | Similar to convolution, but with connections to full input region, i.e., with filter size being exactly the size of the input volume. | inner_product_layer.cpp | fully_connected_op.cc | dense, Tensordot | fully_connected | MatMul | torch.nn.Linear
HeatmapMaxKeypoint Keypoint detection n/a n/a n/a n/a n/a
Input This is an input layer to the network. input_layer.cpp input n/a n/a n/a
InstanceNorm Instance normalization Supported as batch_norm_layer with 'use_global_stats' = false. instance_norm_op.cc InstanceNormalization
L2Norm L2 normalization along innermost dimension. n/a n/a L2 Normalize
Local Response Normalization (LRN) Performs a lateral inhibition by normalizing over local input regions. lrn_layer.cpp local_response_normalization_op.cc LRN
LSTM LSTM recurrent network cell lstm_layer.cpp; tf.contrib.rnn.BasicLSTMCell, tf.nn.static_rnn; LSTM
Normalize Instance normalization using RMS instead of mean/variance. Note that this layer is not available on the tip of Caffe. It requires a compatible branch of Caffe. n/a n/a n/a
Output There is no explicit output layer as the results from any layer in the network can be specified as an output when loading a network. n/a n/a n/a n/a n/a n/a n/a n/a n/a
Pack Packs a list of tensors of rank "r" into a single rank (r+1) tensor. n/a n/a stack n/a n/a
Pad Pads the edges of the input tensor in any or all dimensions, as specified. Padding values can be given explicitly ("CONSTANT" mode in tf) or produced by mirror padding ("REFLECT" mode in tf). n/a n/a pad pad n/a torch.nn.functional.pad
Permute | Permute is used to rearrange the dimensions of a tensor. | Note that this layer is not available on the tip of Caffe. It requires a compatible branch of Caffe. permute_layer.cpp | n/a | transpose | transpose | Transpose | torch.transpose
Pooling | Pooling operation down samples the input volume spatially. Both average and max pooling are supported. | pooling_layer.cpp | pool_op.cc | average_pooling2d, max_pooling2d | average_pool_2d, max_pool_2d | MaxPool, AveragePool, GlobalMaxPool, GlobalAveragePool | torch.nn.MaxPool2d, torch.nn.AvgPool2d, torch.nn.AdaptiveAvgPool2d
Power Power layer computes (shift + scale * x) ^ power for input x. (?) power_layer.cpp n/a n/a n/a n/a (?)
Prelu activation function: prelu [ i.e., y = max(0, x) + a*min(0,x) ] prelu_layer.cpp prelu_op.cc PReLU PRelu
Prior Box | Generates the prior boxes of designated sizes and aspect ratios. Typically used in SSD networks. This layer is handled (folded in) and removed from the DLC model during conversion. | Note that this layer is not available on the tip of Caffe. It requires a compatible branch of Caffe. prior_box_layer.cpp | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
Proposal Outputs region proposals, usually for consumption of an ROIPooling layer. Typically used in Faster RCNN. Note that this layer is not available on the tip of Caffe. It requires a compatible branch of Caffe. proposal_layer.py n/a n/a n/a n/a
Relu activation function: relu [ i.e., y = max(0,x) ] relu_layer.cpp relu_op.cc relu relu Relu torch.nn.ReLU
Reshape Change dimensions of the input to a layer reshape_layer.cpp utility_ops.cc reshape reshape Reshape torch.reshape
Sigmoid activation function: sigmoid [ i.e., y = 1/(1 + exp(-x)) ] sigmoid_layer.cpp sigmoid_op.cc sigmoid Sigmoid
Scale (Image) | Input image scaling, maintains aspect ratio. This function is primarily intended for images, but technically any 2D input data can be processed if it makes sense. Scaling parameters are provided as an option to the model converter tool. | There is no such Caffe layer by itself. This functionality is technically part of the Caffe data provider. data_layer.cpp | Resize Nearest Neighbor (Not supported on DSP): resize_op.cc | Resize Bilinear (does not support align_corners=True): tf.image.resize_bilinear; Resize Nearest Neighbor (Not supported on DSP): tf.image.resize_nearest_neighbor | resize_bilinear, resize_nearest_neighbor | n/a | torch.nn.UpsamplingBilinear2d
Scale Elementwise scale, optionally add biases. scale_layer.cpp n/a n/a n/a n/a n/a
Silence Silence is handled and removed from the model during conversion, similar to Dropout. silence_layer.cpp n/a n/a n/a n/a n/a n/a n/a n/a
Slice Slices an input layer into multiple output layers. slice_layer.cpp concat_split_op.cc split Slice
Softmax Supports 1D and 2D modes. softmax_layer.cpp softmax_op.cc softmax softmax Softmax torch.nn.Softmax
Space to Depth Rearranges blocks of spatial data into depth. n/a n/a tf.nn.space_to_depth n/a
Strided Slice Extracts a slice of size (end-begin)/stride from the given input_tensor. Starting at the location specified by begin the slice continues by adding stride to the index until all dimensions are not less than end. Note that a stride can be negative, which causes a reverse slice. n/a n/a tf.strided_slice n/a
Tanh activation function: tanh [ i.e., y = tanh(x) ] tanh_layer.cpp tanh_op.cc tanh tanh Tanh torch.nn.Tanh
Tile Copies a blob along the specified dimensions tile_layer.cpp tile_op.cc n/a
Unpack Unpacks "n" tensors from a single packed tensor splitting along the axis dimension. n/a n/a unstack n/a n/a
Cross Correlation Layer Computes dot products between the entries of the filter and the input at any position and then rotates the result by 180 degrees. correlation_layer.cpp n/a n/a n/a n/a n/a
Embedding Layer Turns positive integers (indexes) into dense vectors of fixed size. n/a n/a embedding_layer n/a
ExtractGlimpse Returns a set of windows called glimpses extracted at location offsets from the input tensor. n/a n/a extract_glimpse n/a n/a n/a
Gather Gather slices from params axis according to indices. n/a n/a gather Gather
Image Projective Transform Applies the projective transform to the image represented by a tensor of shape (num_images, num_rows, num_columns, num_channels). n/a n/a image_projective_transform n/a n/a n/a
Moments Calculate the mean and variance of input tensor. n/a n/a moments n/a n/a n/a
Depth To Space Rearranges data from depth into blocks of spatial data. This is the reverse transformation of SpaceToDepth. n/a n/a depth_to_space depth_to_space n/a torch.nn.PixelShuffle
Bbox Transform Transform proposal bounding boxes to target bounding box using bounding box regression deltas. n/a bbox_transform_op.cc n/a n/a n/a n/a
Reduce (max, min, sum, product, mean) Computes the maximum, minimum, sum, product, or mean of elements across dimensions of a tensor. n/a n/a reduce_max, reduce_min, reduce_sum, reduce_prod, reduce_mean n/a
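
The scalar formulas quoted in the table for the Relu, Prelu, Elu, Sigmoid, and Power entries can be written out directly. The following is a minimal reference sketch in plain Python, not SNPE code: the function names and the default parameter values are assumptions chosen for illustration, while the formulas themselves are the ones given in the table.

```python
import math


def relu(x: float) -> float:
    # y = max(0, x)
    return max(0.0, x)


def prelu(x: float, a: float) -> float:
    # y = max(0, x) + a * min(0, x)
    return max(0.0, x) + a * min(0.0, x)


def elu(x: float, a: float = 1.0) -> float:
    # f(x) = max(0, x) + a * (exp(min(0, x)) - 1); default a is an assumption
    return max(0.0, x) + a * (math.exp(min(0.0, x)) - 1.0)


def sigmoid(x: float) -> float:
    # y = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + math.exp(-x))


def power_layer(x: float, shift: float = 0.0, scale: float = 1.0,
                power: float = 1.0) -> float:
    # (shift + scale * x) ^ power; default values are assumptions
    return (shift + scale * x) ** power


for value in (-2.0, 0.0, 3.0):
    print(value, relu(value), prelu(value, 0.25), elu(value), sigmoid(value))
```

A runtime applies these functions elementwise to whole tensors; the scalar form is used here only to keep the formulas easy to check by hand.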

Note: The AIP runtime supports all layers supported by the DSP runtime, because layers not supported by HTA run on HVX.
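
For the data-rearranging entries in the table (Channel Shuffle, Space to Depth, Depth To Space), a small sketch on nested Python lists may help make the descriptions concrete. It is not SNPE code. It assumes the common reshape-and-transpose definition of channel shuffle and the channel ordering used by TensorFlow's space_to_depth on a single height x width x channels image; the function names are hypothetical.

```python
def channel_shuffle(channels, groups):
    """Interleave a list of channels across `groups` equally sized groups."""
    count = len(channels)
    if count % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = count // groups
    # View the channels as a (groups, per_group) grid and read it column-wise.
    return [channels[g * per_group + i]
            for i in range(per_group) for g in range(groups)]


def space_to_depth(image, block):
    """Move each block x block spatial patch into the channel dimension."""
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for h in range(height // block):
        row = []
        for w in range(width // block):
            pixel = []
            for dy in range(block):
                for dx in range(block):
                    pixel.extend(image[h * block + dy][w * block + dx])
            row.append(pixel)
        out.append(row)
    return out


def depth_to_space(image, block):
    """Inverse of space_to_depth: spread channel groups back over space."""
    height, width = len(image), len(image[0])
    depth = len(image[0][0]) // (block * block)
    out = [[None] * (width * block) for _ in range(height * block)]
    for h in range(height):
        for w in range(width):
            for dy in range(block):
                for dx in range(block):
                    start = (dy * block + dx) * depth
                    out[h * block + dy][w * block + dx] = \
                        image[h][w][start:start + depth]
    return out


shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
image = [[[1], [2]], [[3], [4]]]           # 2 x 2 image with one channel
packed = space_to_depth(image, block=2)    # 1 x 1 image with four channels
restored = depth_to_space(packed, block=2)
print(shuffled, packed, restored)
```

With two channels and two groups the shuffle returns its input unchanged, which is consistent with the table's remark that at least 4 channels are needed for the layer to have any effect.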