Hi, I am running a neural network with QNN on the Snapdragon 8 Gen 3 HTP backend. However, I have encountered a puzzling issue with memory usage that I hope you can shed some light on.
Specifically, after finalizing the graph, I noticed a roughly 3x increase in memory usage. For instance, when I added an INT8 Conv2D layer to implement FullyConnected (input shape 1,1,1,2048 and weight shape 1,1,2048,2048), the memory consumption should be only about 4MB, yet the finalized graph consumed 12MB. I observed a similar trend with 1,1,4096,4096 weights.
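To make the expectation explicit, here is the arithmetic behind the "should be 4MB" figure (illustrative only — `int8_weight_mb` is a hypothetical helper, not a QNN API call; it assumes 1 byte per INT8 element and no padding or metadata):

```python
def int8_weight_mb(shape):
    """Size in MiB of an INT8 tensor with the given shape (1 byte per element)."""
    n = 1
    for d in shape:
        n *= d
    return n / (1024 * 1024)

print(int8_weight_mb((1, 1, 2048, 2048)))  # 4.0  -> expected ~4MB per layer
print(int8_weight_mb((1, 1, 4096, 4096)))  # 16.0 -> expected ~16MB per layer
```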
Here is the log from adding three Conv2D ops, each with 1,1,2048,2048 weights. The number in parentheses is the virtual memory usage.
Memory Usage: 384 MB(3671)
Memory Usage: 388 MB(3671) after add: model.layers.0.ln1
Memory Usage: 393 MB(3671) after add: model.layers.0.ln2
Memory Usage: 397 MB(3671) after add: model.layers.0.ln3
Memory Usage: 397 MB(3671) at: before graph finalize
Memory Usage: 435 MB(3685) at: after graph finalize
As the log shows, each added Conv2D op costs about 4MB as expected, but graph finalization adds a further 38MB — roughly three times the 12MB of weights actually in the graph.
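The step-to-step deltas can be pulled directly out of the log to make the finalize-time jump explicit (a small sketch — the log text is copied from above, and only the "Memory Usage: N MB" samples are used):

```python
import re

# The log quoted above, concatenated as one string.
log = ("Memory Usage: 384 MB(3671)"
       "Memory Usage: 388 MB(3671) after add: model.layers.0.ln1"
       "Memory Usage: 393 MB(3671) after add: model.layers.0.ln2"
       "Memory Usage: 397 MB(3671) after add: model.layers.0.ln3"
       "Memory Usage: 397 MB(3671) at: before graph finalize"
       "Memory Usage: 435 MB(3685) at: after graph finalize")

# Extract each resident-memory sample and compute consecutive differences.
samples = [int(m) for m in re.findall(r"Memory Usage: (\d+) MB", log)]
deltas = [b - a for a, b in zip(samples, samples[1:])]
print(deltas)  # [4, 5, 4, 0, 38] -> ~4MB per op, then +38MB at finalize
```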
I wonder whether this is attributable to a memory bug in QNN, or whether QNN graph finalization performs specific optimizations that inadvertently inflate memory allocation. Since I intend to run large neural networks, this threefold increase in memory usage poses a significant challenge and could lead to out-of-memory errors.
Any insights or assistance you could provide in resolving this issue would be immensely appreciated.
Looking forward to your prompt response!
Thanks.