I am confused about model quantization and inputs. Suppose I have an ONNX network that expects 256x256 inputs as 32-bit floats, and I convert this network to a quantized DLC. What is the expected input format? Would the quantized model expect 256x256 8-bit values, or does SNPE quantize the input from 32-bit floats to 8-bit itself in order to run on AIP/DSP?

Quantization means the network parameters of the model are converted to an 8-bit fixed-point representation (not floating point). This does not change the input you provide to the model, so you still supply 32-bit float data; as far as I know, SNPE quantizes the input itself at runtime when the model runs on DSP/AIP. Also note that, per the documentation, snpe-onnx-to-dlc only converts the ONNX model to a DLC model; you need to run snpe-dlc-quantize afterwards to quantize it. The input provided to the model needs to have a batch size of 1 (1,256,256,3 in your case if it is a color image).
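To make this concrete, here is a minimal Python sketch of the two ideas above. It builds a batch-size-1 float32 buffer in (1,256,256,3) layout and writes it as a raw file, which is the kind of file you would list as input when running the model. It also shows, purely as an illustration, how a float range can be mapped to 8-bit fixed-point codes. The helper names, the fill value, the file name and the [0, 1] range are all assumptions made for the example; the quantize function is not SNPE's actual implementation, and you do not need to call anything like it yourself.

```python
import os
import tempfile
from array import array


def make_input(height=256, width=256, channels=3, fill=0.5):
    """Return a flat float32 buffer for one image in (1, H, W, C) layout."""
    # Hypothetical constant fill; real code would copy preprocessed pixels here.
    return array("f", [fill]) * (1 * height * width * channels)


def quantize_8bit(values, vmin, vmax):
    """Illustrative affine mapping of floats in [vmin, vmax] to codes 0..255."""
    scale = (vmax - vmin) / 255.0
    return [min(255, max(0, round((v - vmin) / scale))) for v in values]


def dequantize_8bit(codes, vmin, vmax):
    """Map 8-bit codes back to approximate float values."""
    scale = (vmax - vmin) / 255.0
    return [vmin + c * scale for c in codes]


# The model input stays float32: 1 * 256 * 256 * 3 values, 4 bytes each.
buf = make_input()
raw_path = os.path.join(tempfile.gettempdir(), "input_256x256x3.raw")  # hypothetical name
with open(raw_path, "wb") as f:
    buf.tofile(f)
raw_size = os.path.getsize(raw_path)

# What 8-bit fixed-point quantization does conceptually (assumed range [0, 1]).
codes = quantize_8bit([0.0, 0.5, 1.0], 0.0, 1.0)
restored = dequantize_8bit(codes, 0.0, 1.0)
```

The raw file holds 196608 floats, so the input is still four bytes per value even though the weights inside the quantized DLC are 8-bit. The round trip through quantize_8bit and dequantize_8bit loses a little precision, which is the accuracy cost quantization trades for running on fixed-point hardware.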

