Snapdragon Neural Processing Engine SDK
Reference Guide
Quantizing a Model

Each of the snpe-framework-to-dlc conversion tools converts a non-quantized model into a non-quantized DLC file. Quantization requires an additional step: the snpe-dlc-quantize tool quantizes the model to one of the supported fixed-point formats.

For example, the following command will convert an Inception v3 DLC file into a quantized Inception v3 DLC file.

snpe-dlc-quantize --input_dlc inception_v3.dlc --input_list image_file_list.txt
                  --output_dlc inception_v3_quantized.dlc

The image list specifies paths to raw image files used for quantization. See snpe-dlc-quantize for more details.
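As a sketch of how such an input list might be generated (the data/cropped directory layout here is an assumption for illustration, not part of the SDK):

```python
import glob
import os

# Assumed layout: one preprocessed .raw tensor per calibration image.
raw_files = sorted(glob.glob("data/cropped/*.raw"))

# snpe-dlc-quantize resolves entries as written, so absolute paths are safest.
with open("image_file_list.txt", "w") as f:
    for path in raw_files:
        f.write(os.path.abspath(path) + "\n")
```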

The tool requires that the batch dimension of the input DLC file be set to 1 during model conversion. The batch dimension can be changed to a different value for inference by resizing the network during initialization.

For details on the quantization algorithm, and information on when to use a quantized model, see Quantized vs Non-Quantized Models.

Input data for quantization

To properly calculate the ranges for the quantization parameters, a representative set of input data needs to be used as input into snpe-dlc-quantize.

Experimentation shows that providing 5-10 input data examples in the input_list for snpe-dlc-quantize is usually sufficient, and definitely practical for quick experiments. For more robust quantization results, we recommend providing 50-100 examples of representative input data for the given model use case, without using data from the training set. The representative input data set ideally should include all input data modalities which represent/produce all the output types/classes of the model, preferably with several input data examples per output type/class.
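Each raw file must contain exactly batch x height x width x channels elements; a quick sanity check of the expected byte count (the 4-byte element size assumed here matches the byte counts the tool reports for non-quantized float inputs):

```python
def expected_raw_bytes(batch, height, width, channels, bytes_per_element=4):
    """Expected on-disk size of a raw input file; 4 bytes/element assumes float32."""
    return batch * height * width * channels * bytes_per_element

# A {1, 224, 224, 3} float32 input should be 602112 bytes on disk.
print(expected_raw_bytes(1, 224, 224, 3))  # → 602112
```

Comparing this number against the actual file size of each entry in the input list catches dimension mismatches before running the quantizer.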

The models listed in network_models are verified to quantize successfully. Other or newer models have not been tried.

Possible problems you might encounter

Below are some error messages you might encounter while running snpe-dlc-quantize.

  • width and height mismatch

    [INFO] InitializeStderr: DebugLog initialized.
    [INFO] Writing intermediate model
    [WARNING] NetworkTopology::populateNetworkDesc network desc inputs is empty. Does this network have data/input layer?
    [ERROR] Input file <raw file path> has inappropriate number of bytes (1072812) for non-NV21 input for batch 1 and size {1, 224, 224, 3}.
    Expected bytes 602112
    [INFO] DebugLog shutting down.

    This means the dimensions of the raw images listed in your image_file_list.txt do not match the model's input requirements. In this case, the model expects a {1, 224, 224, 3} input, but yours is probably {1, 299, 299, 3}, which causes the error.

    One way to fix this is to change the input dimensions, i.e. regenerate your raw files at the required picture size.

      cd $SNPE_ROOT/models/inception_v3/scripts
      python create_inceptionv3_raws.py -s 224 -i input_pictures/ -d output_pictures_224
      ## input_pictures contains your picture files, e.g. monkey.jpg
      ## output_pictures_224 contains the input files plus the generated 224x224 *.raw files
  • batch size mismatch

    [INFO] InitializeStderr: DebugLog initialized.
    [INFO] Writing intermediate model
    [WARNING] NetworkTopology::populateNetworkDesc network desc inputs is empty. Does this network have data/input layer?
    [ERROR] Input file <raw file path> has inappropriate number of bytes (602112) for non-NV21 input for batch 1 and size {10, 224, 224, 3}.
    Expected bytes 6021120
    [INFO] DebugLog shutting down.

    This is because the model requires an input with batch size 10, while your input has batch size 1. A simple fix is to concatenate raw files until the batch requirement is met, for example:

      cat A.raw B.raw C.raw > D.raw
      ## produces D.raw with batch size 3

After generating the new raw files and updating image_file_list.txt, you should be able to quantize the DLC successfully.
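The cat approach above can also be scripted when many batched files are needed; a minimal sketch (file names here are illustrative):

```python
def concat_raws(input_paths, output_path):
    """Concatenate per-image raw tensors into one batched raw file.

    Byte-level concatenation is equivalent to stacking along the batch
    dimension, since raw files are flat arrays in batch-major order.
    """
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as f:
                out.write(f.read())
```

For example, concat_raws(["A.raw", "B.raw", "C.raw"], "D.raw") builds a batch-3 input, after which image_file_list.txt should point at D.raw.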