Some questions about quantization
Posted: Tue, 2021-11-09 23:17

What does "output encoding" mean?


Posted: Tue, 2021-11-16 08:47

Quantization is one way to reduce the computational demands of AI workloads and improve power efficiency. It is an umbrella term covering many different techniques that map input values from a large set to output values in a smaller set. In SNPE, non-quantized DLC files use 32-bit floating point representations of network parameters, whereas quantized DLC files use fixed point representations of network parameters, generally 8-bit weights and 8- or 32-bit biases.
By default, the snpe-xxx-to-dlc tool converts the original model into a non-quantized model.
To quantize the model to 8 bit fixed point, use snpe-xxx-quantize (xxx is caffe, caffe2, onnx, tf).
The quantization algorithm takes a set of floating point values as input and outputs the corresponding set of 8-bit fixed point values together with the encoding parameters (encoding-min and encoding-max). These output encoding parameters define the range of floating point values that the fixed point format can represent.
encoding-min: specifies the smallest floating point value that will be represented by the fixed point value of 0
encoding-max: specifies the largest floating point value that will be represented by the fixed point value of 255
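
To make the role of encoding-min and encoding-max concrete, here is a minimal Python sketch of 8-bit min/max quantization. It is not SNPE's actual implementation: the function names quantize and dequantize are made up for illustration, the range is taken directly from the smallest and largest input value, and any extra range adjustments a real tool may apply are left out.

```python
def quantize(values, num_bits=8):
    """Map floats to fixed point codes in [0, 2**num_bits - 1].

    Illustrative only: encoding-min/max are simply the min/max of the input.
    """
    levels = 2 ** num_bits - 1          # 255 for 8 bits
    enc_min = min(values)               # represented by the code 0
    enc_max = max(values)               # represented by the code 255
    if enc_max == enc_min:
        # Assumption: widen a zero-width range so the step size is non-zero.
        enc_max = enc_min + 1.0
    step = (enc_max - enc_min) / levels
    codes = []
    for v in values:
        q = int(round((v - enc_min) / step))
        codes.append(max(0, min(levels, q)))  # clamp to the valid code range
    return codes, enc_min, enc_max


def dequantize(codes, enc_min, enc_max, num_bits=8):
    """Recover approximate floats from codes and the encoding parameters."""
    levels = 2 ** num_bits - 1
    step = (enc_max - enc_min) / levels
    return [enc_min + q * step for q in codes]


codes, enc_min, enc_max = quantize([-1.0, 0.0, 0.5, 1.55])
restored = dequantize(codes, enc_min, enc_max)
```

In this sketch the step between neighbouring codes is (encoding-max - encoding-min) / 255, so a wider encoding range means coarser resolution for every value inside it.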

For more information on quantization, refer to the following links:

