Snapdragon Neural Processing Engine SDK
Reference Guide

SNPE supports the network layer types listed in the table below.
See Limitations for details on the limitations and constraints for the supported runtimes and individual layer types.
All layers supported by the GPU runtime are valid in both GPU modes: GPU_FLOAT32_16_HYBRID and GPU_FLOAT16.
GPU_FLOAT32_16_HYBRID: data storage is done in half float and computation is done in full float.
GPU_FLOAT16: both data storage and computation are done in half float.
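The numerical effect of the two GPU modes can be illustrated with a small simulation. This is only a sketch, not SNPE code: it assumes that "half float" means IEEE 754 binary16 and "full float" means binary32, and the helper names (to_half, to_single, sum_hybrid, sum_half) are hypothetical. It uses only the Python standard library, rounding values through the struct format characters "e" (half) and "f" (single).

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_single(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_hybrid(values):
    # Assumed model of GPU_FLOAT32_16_HYBRID: inputs and the final result
    # are stored as half floats, but the running sum is kept in single precision.
    acc = 0.0
    for v in values:
        acc = to_single(acc + to_half(v))
    return to_half(acc)


def sum_half(values):
    # Assumed model of GPU_FLOAT16: every intermediate result is rounded
    # back to half precision.
    acc = 0.0
    for v in values:
        acc = to_half(acc + to_half(v))
    return acc


values = [1.0] * 4096
hybrid_result = sum_hybrid(values)
half_result = sum_half(values)
print(hybrid_result, half_result)  # prints: 4096.0 2048.0
```

In the half-only accumulation the sum stalls at 2048, because 2049 is not representable in half precision and rounds back to 2048. The hybrid accumulation reaches 4096, which is representable when stored as a half float. Real networks will differ in detail, but the example shows why long reductions can be more sensitive in GPU_FLOAT16 mode.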
A list of supported ONNX operations can be found at ONNX Operator Support.
Layer Type  Description  Caffe Equivalent  Caffe2 Equivalent  TensorFlow Equivalent  TFLite Equivalent  Onnx Equivalent  PyTorch Equivalent  CPU  GPU  DSP/AIP

ArgMax  Returns the index with the largest value across axes of a tensor.  n/a  n/a  ArgMax  n/a  ✔  ✔  ✔
Batch normalization (+ Scaling)  Batch normalization (optionally followed by scaling operation).  Maps to the combination of batch_norm_layer followed immediately by scale_layer. batch_norm_layer.cpp scale_layer.cpp  spatial_batch_norm_op.cc  batch_normalization  BatchNormalization  torch.nn.BatchNorm2d  ✔  ✔  ✔
Channel Shuffle  Interleaves the channels in groups. The number of channels must be divisible by the number of groups. At least 4 channels are required for this layer to have any effect.  n/a  channel_shuffle_op.h  n/a  n/a  n/a  torch.nn.PixelShuffle  ✔  ✔  ✔ 
Color space conversion  Converts input image color format (encoding type) into SNPE native color space. Color space conversion parameters are provided as an option to the model converter tool.  There is no such Caffe layer by itself. This functionality is technically part of the Caffe data provider. data_layer.cpp  n/a  n/a  n/a  n/a  n/a  ✔  ✔  ✔ 
Concatenation  This layer concatenates multiple inputs into a single output.  concat_layer.cpp  concat_split_op.cc  concat  Concatenation  Concat  torch.cat  ✔  ✔  ✔ 
Convolution  Computes dot products between the entries of the filter and the input at any position.  Includes support for groups and dilation. conv_layer.cpp  Includes support for spatial reflection padding. conv_op.cc  conv2d  conv_2d  Conv  torch.nn.Conv2d  ✔  ✔  ✔ 
Crop  Crops one layer to the dimensions of a reference layer.  crop_layer.cpp  utility_ops.cc  n/a  ✔  ✔  ✔  
CropAndResize  Crops normalized regions from a batch of images and scales them to a desired output size.  crop_and_resize  n/a  ✔  ✘  ✘  
CrossMap Response Normalization  This is an option within LRN layer.  lrn_layer.cpp  n/a  local_response_normalization  n/a  LRN  ✔  ✔  ✔  
Deconvolution  Performs deconvolution operation.  deconv_layer.cpp  conv_transpose_op.cc  conv2d_transpose  transpose_conv  ConvTranspose  torch.nn.ConvTranspose2d  ✔  ✔  ✔ 
Depthwise Convolution  Performs a 2D depthwise convolution.  Equivalent to Convolution with 'num_output' = input channels and 'group' = 'num_output' conv_layer.cpp  Equivalent to Convolution with 'num_output' = input channels and 'group' = 'num_output' conv_op.cc  tf.nn.depthwise_conv2d  tfl.depthwise_conv_2d  n/a  n/a  ✔  ✔  ✔ 
Detection Output  Generate the detection output based on location and confidence predictions by doing non maximum suppression. Typically used in SSD networks.  Note that this layer is not available on the tip of Caffe. It requires a compatible branch of Caffe. detection_output_layer.cpp  n/a  n/a  n/a  n/a  n/a  ✔  ✘  ✔ 
Dropout  Layer is used for training only. Converters remove this layer from DLC creation.  dropout_layer.cpp  dropout_op.cc  dropout  n/a  Dropout  torch.nn.Dropout  n/a  n/a  n/a 
Elementwise  Supports SUM, DIV, PROD, MAX, MIN, and SUB mode with coefficients.  Support for MAX, SUM, and PROD eltwise_layer.cpp  Support for SUM and MAX utility_ops.cc  add add_n mul maximum minimum subtract  add sum div mul maximum minimum sub  Add Mul Max Min Sub Sum  torch.sum torch.mul torch.max torch.min  ✔  ✔  ✔ 
Elementwise Unary  Supports ABS, CEIL, EXP, FLOOR, LOG, NEG, ROUND, SIN, and SQRT.  floor sqrt  abs exp floor sqrt  Abs Ceil Exp Floor Log Neg Round Sin Sqrt  ✔  ✔  ✔  
Elu  activation function: elu [ i.e., f(x) = max(0,x) + a*(exp(min(0,x)) - 1) ]  elu_layer.cpp  elu_op.cc  elu  elu  ✔  ✔  ✘
Flatten  Flatten an input to a layer  flatten_layer.cpp  utility_ops.cc  flatten  Flatten  torch.flatten  ✔  ✔  ✔  
Fully connected  Similar to convolution, but with connections to full input region, i.e., with filter size being exactly the size of the input volume.  inner_product_layer.cpp  fully_connected_op.cc  dense Tensordot  fully_connected  MatMul  torch.nn.Linear  ✔  ✔  ✔ 
HeatmapMaxKeypoint  Keypoint detection  n/a  n/a  n/a  n/a  n/a  ✔  ✔  ✘  
Input  This is an input layer to the network.  input_layer.cpp  input  n/a  n/a  n/a  ✔  ✔  ✔  
InstanceNorm  Instance normalization  Supported as batch_norm_layer with 'use_global_stats' = false.  instance_norm_op.cc  InstanceNormalization  ✔  ✔  ✔  
L2Norm  L2 normalization along innermost dimension.  n/a  n/a  L2 Normalize  ✔  ✔  ✘  
Local Response Normalization (LRN)  Performs a lateral inhibition by normalizing over local input regions.  lrn_layer.cpp  local_response_normalization_op.cc  LRN  ✔  ✔  ✔  
LSTM  LSTM recurrent network cell  lstm_layer.cpp  tf.contrib.rnn.BasicLSTMCell tf.nn.static_rnn  LSTM  ✔  ✔  ✔  
Normalize  Instance normalization using RMS instead of mean/variance.  Note that this layer is not available on the tip of Caffe. It requires a compatible branch of Caffe.  n/a  n/a  n/a  ✔  ✔  ✘  
Output  There is no explicit output layer as the results from any layer in the network can be specified as an output when loading a network.  n/a  n/a  n/a  n/a  n/a  n/a  n/a  n/a  n/a 
Pack  Packs a list of tensors of rank "r" into a single rank (r+1) tensor.  n/a  n/a  stack  n/a  n/a  ✔  ✘  ✔  
Pad  Performs padding on the edges of the input tensor, in any or all dimensions as specified. Padding values can be specified as a constant ("CONSTANT" mode in TF) or by mirror padding ("REFLECT" mode in TF).  n/a  n/a  pad  pad  n/a  torch.nn.functional.pad  ✔  ✘  ✔
Permute  Permute is used to rearrange the dimensions of a tensor.  Note that this layer is not available on the tip of Caffe. It requires a compatible branch of Caffe. permute_layer.cpp  n/a  transpose  transpose  Transpose  torch.transpose  ✔  ✔  ✔ 
Pooling  Pooling operation down samples the input volume spatially. Both average and max pooling are supported.  pooling_layer.cpp  pool_op.cc  average_pooling2d max_pooling2d  average_pool_2d max_pool_2d  MaxPool AveragePool GlobalMaxPool GlobalAveragePool  torch.nn.MaxPool2d torch.nn.AvgPool2d torch.nn.AdaptiveAvgPool2d  ✔  ✔  ✔ 
Power  Power layer computes (shift + scale * x) ^ power for input x.  power_layer.cpp  n/a  n/a  n/a  n/a  ✔  ✔  ✔
Prelu  activation function: prelu [ i.e., y = max(0, x) + a*min(0,x) ]  prelu_layer.cpp  prelu_op.cc  PReLU  PRelu  ✔  ✔  ✔  
Prior Box  Generate the prior boxes of designated sizes and aspect ratios. Typically used in SSD networks. This layer is handled (folded in) and removed from the DLC model during conversion.  Note that this layer is not available on the tip of Caffe. It requires a compatible branch of Caffe. prior_box_layer.cpp  n/a  n/a  n/a  n/a  n/a  n/a  n/a  n/a 
Proposal  Outputs region proposals, usually for consumption of an ROIPooling layer. Typically used in Faster RCNN.  Note that this layer is not available on the tip of Caffe. It requires a compatible branch of Caffe. proposal_layer.py  n/a  n/a  n/a  n/a  ✔  ✘  ✔  
Relu  activation function: relu [ i.e., y = max(0,x) ]  relu_layer.cpp  relu_op.cc  relu  relu  Relu  torch.nn.ReLU  ✔  ✔  ✔ 
Reshape  Change dimensions of the input to a layer  reshape_layer.cpp  utility_ops.cc  reshape  reshape  Reshape  torch.reshape  ✔  ✔  ✔ 
Sigmoid  activation function: sigmoid [ i.e., y = 1/(1 + exp(-x)) ]  sigmoid_layer.cpp  sigmoid_op.cc  sigmoid  Sigmoid  ✔  ✔  ✔
Scale (Image)  Input image scaling, maintains aspect ratio. This function is primarily intended for images, but technically any 2D input data can be processed if it makes sense. Scaling parameters are provided as an option to the model converter tool.  There is no such Caffe layer by itself. This functionality is technically part of the Caffe data provider. data_layer.cpp  Resize Nearest Neighbor (Not supported on DSP) resize_op.cc  Resize Bilinear (does not support align_corners=True) tf.image.resize_bilinear Resize Nearest Neighbor (Not supported on DSP) tf.image.resize_nearest_neighbor  resize_bilinear resize_nearest_neighbor  n/a  torch.nn.UpsamplingBilinear2d  ✔  ✔  ✔ 
Scale  Elementwise scale, optionally add biases.  scale_layer.cpp  n/a  n/a  n/a  n/a  n/a  ✔  ✘  ✔ 
Silence  Silence is handled and removed from the model during conversion, similar to Dropout.  silence_layer.cpp  n/a  n/a  n/a  n/a  n/a  n/a  n/a  n/a 
Slice  Slices an input layer into multiple output layers.  slice_layer.cpp  concat_split_op.cc  split  Slice  ✔  ✔  ✔  
Softmax  Supports 1D and 2D modes.  softmax_layer.cpp  softmax_op.cc  softmax  softmax  Softmax  torch.nn.Softmax  ✔  ✔  ✔ 
Space to Depth  Rearranges blocks of spatial data into depth.  n/a  n/a  tf.nn.space_to_depth  n/a  ✔  ✘  ✔  
Strided Slice  Extracts a slice of size (end - begin)/stride from the given input_tensor. Starting at the location specified by begin, the slice continues by adding stride to the index until all dimensions are not less than end. Note that a stride can be negative, which causes a reverse slice.  n/a  n/a  tf.strided_slice  n/a  ✔  ✔  ✔
Tanh  activation function: tanh [ i.e., y = tanh(x) ]  tanh_layer.cpp  tanh_op.cc  tanh  tanh  Tanh  torch.nn.Tanh  ✔  ✔  ✔ 
Tile  Copies a blob along the specified dimensions  tile_layer.cpp  tile_op.cc  n/a  ✔  ✔  ✔  
Unpack  Unpacks "n" tensors from a single packed tensor splitting along the axis dimension.  n/a  n/a  unstack  n/a  n/a  ✔  ✘  ✔  
Cross Correlation Layer  Computes dot products between the entries of the filter and the input at any position and then rotates the result by 180 degrees.  correlation_layer.cpp  n/a  n/a  n/a  n/a  n/a  ✔  ✔  ✘ 
Embedding Layer  Turns positive integers (indexes) into dense vectors of fixed size.  n/a  n/a  embedding_layer  n/a  ✔  ✘  ✘  
ExtractGlimpse  Returns a set of windows called glimpses extracted at location offsets from the input tensor.  n/a  n/a  extract_glimpse  n/a  n/a  n/a  ✔  ✘  ✘ 
Gather  Gather slices from params axis according to indices.  n/a  n/a  gather  Gather  ✔  ✘  ✔  
Image Projective Transform  Applies the projective transform to the image represented by a tensor of shape (num_images, num_rows, num_columns, num_channels).  n/a  n/a  image_projective_transform  n/a  n/a  n/a  ✔  ✘  ✔ 
Moments  Calculate the mean and variance of input tensor.  n/a  n/a  moments  n/a  n/a  n/a  ✔  ✘  ✔ 
Depth To Space  Rearranges data from depth into blocks of spatial data. This is the reverse transformation of SpaceToDepth.  n/a  n/a  depth_to_space  depth_to_space  n/a  torch.nn.PixelShuffle  ✔  ✘  ✔ 
Bbox Transform  Transform proposal bounding boxes to target bounding box using bounding box regression deltas.  n/a  bbox_transform_op.cc  n/a  n/a  n/a  n/a  ✘  ✘  ✔ 
Reduce (max, min, sum, product, mean)  Computes the maximum, minimum, sum, product, mean of elements across dimensions of a tensor.  n/a  n/a  reduce_max reduce_min reduce_sum reduce_prod reduce_mean  n/a  ✔  ✔  ✔ 
Note: The AIP runtime supports all layers supported by the DSP runtime, because layers not supported by HTA run on HVX.
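The activation and Power formulas quoted in the table can be written as scalar reference functions. The sketch below is illustrative only and is not how any SNPE runtime implements these layers. It uses the standard definitions, in which elu subtracts 1 from the exponential and sigmoid negates its argument. The function names and default parameter values are assumptions chosen for the example.

```python
import math


def relu(x):
    # y = max(0, x)
    return max(0.0, x)


def prelu(x, a):
    # y = max(0, x) + a * min(0, x); "a" is the slope for negative inputs
    return max(0.0, x) + a * min(0.0, x)


def elu(x, a=1.0):
    # f(x) = max(0, x) + a * (exp(min(0, x)) - 1)
    return max(0.0, x) + a * (math.exp(min(0.0, x)) - 1.0)


def sigmoid(x):
    # y = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + math.exp(-x))


def power_layer(x, shift=0.0, scale=1.0, power=1.0):
    # (shift + scale * x) ^ power
    return (shift + scale * x) ** power


print(relu(-2.0))                                        # 0.0
print(prelu(-2.0, 0.25))                                 # -0.5
print(sigmoid(0.0))                                      # 0.5
print(power_layer(3.0, shift=1.0, scale=2.0, power=2.0)) # 49.0
```

Functions like these can serve as a quick sanity check when comparing a converted model's per-layer outputs against expected values, keeping in mind that runtimes using half-float or quantized arithmetic will not match bit for bit.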