Hi,
We are trying to run inference on SM8450's HTP processor using mixed precision.
By mixed precision, we mean that a few layers have to run in int16/int8 and a few layers in fp16 (since the SM8450 HTP supports fp16 inference).
We wanted to know whether this scenario is supported on HTP. If not on HTP, does any other delegate on SM8450, such as GPU or CPU, support this kind of mixed-precision inference?
If the above scenario is not supported, we are also considering creating a User-Defined Operation (UDO) for the few operations we want to run in fp32, while the rest of the operations in the network run in int16 using the default implementation.
Any thoughts/opinions on this idea?
SNPE version: 1.68
Hexagon SDK version: 4.0.0
Regards,
Pratesh
Hi,
The method for processing floating-point inputs and outputs on the HTP target has changed. For the best possible performance, specify the --use_float_io parameter to the quantizer for offline preparation, or the --buffer_data_type argument to the runtime.
For the CPU and GPU runtimes, a quantized model will be dequantized by the runtime, which increases network initialization time and may impact accuracy. A non-quantized model is already in the native format for the CPU and GPU runtimes, so it can be passed directly to the runtime and may be more accurate than the quantized model. Non-quantized models use floating-point representations of the network parameters.
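As a rough sketch of the offline-preparation path (the model and input-list file names are placeholders, and exact flags may differ in SNPE 1.68, so please check the tool's --help output):

```shell
# Offline preparation: quantize the model for HTP while keeping
# floating-point inputs/outputs (placeholder file names).
snpe-dlc-quantize --input_dlc model.dlc \
                  --input_list input_list.txt \
                  --output_dlc model_quantized.dlc \
                  --use_float_io
# Alternatively, pass the --buffer_data_type argument to the runtime
# instead of preparing offline, as noted above.
```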
Thanks.
Thanks for the response, but this doesn't exactly answer the query above.
Anyhow, I recently came across the quantization overrides parameter in the snpe-tensorflow-to-dlc tool. Does SNPE support mixed precision through this overrides option? Can I override the default quantization precision and make particular layers run in fp16?
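For reference, my understanding is that the overrides file passed via --quantization_overrides is a JSON encodings file; a minimal sketch might look like the following (the tensor name is hypothetical, and the exact schema should be checked against the SNPE documentation for your version):

```json
{
  "activation_encodings": {
    "conv2d_output:0": [
      {
        "bitwidth": 16,
        "dtype": "float"
      }
    ]
  },
  "param_encodings": {}
}
```

The intent here would be to mark one layer's output tensor as fp16 while leaving the rest of the network to the default quantization, but please confirm whether SNPE 1.68 honors such per-tensor float overrides on HTP.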
BR.