We are running qaic-exec in simulation mode.
When we run inference at FP32 precision, we get an inference time of 191 seconds per image.
Once we set the flags for INT8 quantization, the inference time rises to 26 minutes per image. We have tried both static and dynamic quantization and see similar timing for both methods.
We followed these steps:
1. Generate the profile for the model:
./qaic-exec -m=<ONNX model path> -input-list-file=<text file path> -dump-profile=<path to dump profile(.yaml file)>
2. Run inference using static quantization:
./qaic-exec -m=<ONNX model path> -input-list-file=<text file path> -load-profile=<path to load profile(.yaml file)> -write-output-dir=<path to store the outputs> -quantization-precision-bias=Int8 -quantization-precision=Int8 -quantization-schema=symmetric_with_uint8
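For illustration only, the two steps above might look as follows with the placeholders filled in. The model, input-list, profile, and output paths here are hypothetical examples, not paths from the original report; the flags are exactly those shown above.

```sh
# Step 1: profile the model (all paths below are hypothetical examples)
./qaic-exec -m=model/resnet50.onnx \
            -input-list-file=inputs/input_list.txt \
            -dump-profile=profiles/resnet50_profile.yaml

# Step 2: INT8 inference using the generated profile
./qaic-exec -m=model/resnet50.onnx \
            -input-list-file=inputs/input_list.txt \
            -load-profile=profiles/resnet50_profile.yaml \
            -write-output-dir=outputs/ \
            -quantization-precision-bias=Int8 \
            -quantization-precision=Int8 \
            -quantization-schema=symmetric_with_uint8
```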
We observed that the inference time for INT8 quantization is far higher than that for FP32 in simulator mode. Is this expected behaviour of qaic-exec in simulator mode?
If not, please let us know the suitable command-line arguments for execution in the simulator.
Dear customer,
Inference time is better if you input fixed-point tensors rather than float tensors, because float data needs extra time to be quantized to fixed point.
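To illustrate the extra work the reply refers to, here is a minimal sketch of symmetric per-tensor INT8 quantization in NumPy. This is a generic example of the technique, not qaic-exec's internal implementation; the function names and the sample tensor are made up for illustration. When inputs are already fixed-point, this conversion step is skipped.

```python
import numpy as np

def quantize_symmetric_int8(x):
    """Symmetric per-tensor quantization: map floats to int8 via one scale.

    This per-element rounding/clipping is the kind of extra pass that
    float inputs incur before quantized inference can run.
    """
    # Guard against an all-zero tensor, which would give a zero scale.
    scale = max(np.max(np.abs(x)) / 127.0, 1e-12)
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to approximate floats."""
    return q.astype(np.float32) * scale

# Hypothetical input tensor standing in for preprocessed image data.
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale = quantize_symmetric_int8(x)
x_hat = dequantize(q, scale)
```

Feeding the tool tensors that are already in the quantized integer domain avoids repeating this conversion on every inference.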
BR.
Wei