Forums - Accessing GPU memory in UDO

Accessing GPU memory in UDO

anirudhm84

Join Date: 13 Jun 17

Posts: 8

Posted: Thu, 2021-12-30 00:13

Hi,

When implementing custom layers with the UDO framework, GPU data can be accessed as shown in the UDO Softmax example:

SnpeUdo_GpuTensorData_t* tensorData0 = static_cast<SnpeUdo_GpuTensorData_t*>(inputs[0].tensorData);

The variable above is then used to set the kernel argument as follows:

err = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)tensorData0->mem);

tensorData0->mem points to a cl_mem object.

However, calling clEnqueueReadBuffer / clEnqueueWriteBuffer on this cl_mem object returns an error, and even when the operation completes, the values returned do not look valid (mostly zeros, with some extreme outliers around 10e14 / 10e-14).

Is there a way to access the data in the GPU buffers?

For example, running snpe-net-run with the --debug flag dumps the output of each layer, so it must be possible to access this data. How can it be done?

Re: Accessing GPU memory in UDO #1

weihuan

Join Date: 12 Apr 20

Posts: 270

Posted: Thu, 2021-12-30 17:48

Dear customer,

You can specify the execution options, setting profiling_level to detailed and adding the --debug option to enable output for all layers in qnn-net-run. That way you get the data for every layer after the model has executed.

BR.

Wei

Re: Accessing GPU memory in UDO #2

anirudhm84

Join Date: 13 Jun 17

Posts: 8

Posted: Mon, 2022-01-03 01:33

Hi Wei,

I am already using --debug to get the layer outputs, but I want to know whether I can access the layer output from within the code itself.

Questions:

1. Can I run multiple kernels within the same layer?

For example, run kernel 1 on the inputs and write to a temporary cl_mem buffer, then run kernel 2 with this temporary buffer as input and write to the output buffer.

2. Can I access the layer input and output data?

When I use clEnqueueReadBuffer and clEnqueueWriteBuffer on the input / output buffers, I get errors / segfaults. Is there some limitation, or is there something else I need to do on my side?
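One thing that may be worth checking for question 2: clSetKernelArg takes a pointer to the argument value, so the Softmax example passing tensorData0->mem directly suggests that mem holds the address of a cl_mem handle rather than the handle itself. clEnqueueReadBuffer / clEnqueueWriteBuffer take the cl_mem handle by value, so passing mem unchanged would hand them the wrong pointer. The sketch below only illustrates that distinction. It uses stand-in types (everything prefixed Fake, plus the helper handleFrom, is hypothetical) so that it compiles without the OpenCL or SNPE headers, and it is not the actual SNPE definition.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins so the sketch builds without OpenCL or SNPE headers.
// In OpenCL, cl_mem is an opaque pointer type; FakeClMem plays that role.
struct FakeClMemObject { int id; };
using FakeClMem = FakeClMemObject*;

// Hypothetical mirror of the tensor-data struct, ASSUMING that `mem`
// stores the address of a handle, not the handle itself.
struct FakeGpuTensorData {
    void* mem;
};

// Plays the role of clSetKernelArg: receives a POINTER to the argument value.
int fakeSetKernelArg(std::size_t argSize, const void* argValue) {
    if (argSize != sizeof(FakeClMem) || argValue == nullptr) return -1;
    FakeClMem handle = *static_cast<const FakeClMem*>(argValue);
    return handle->id;
}

// Plays the role of clEnqueueReadBuffer: receives the handle BY VALUE.
int fakeEnqueueReadBuffer(FakeClMem buffer) {
    return buffer->id;
}

// Recover the handle from the tensor data by dereferencing once.
FakeClMem handleFrom(const FakeGpuTensorData& tensorData) {
    assert(tensorData.mem != nullptr);
    return *static_cast<FakeClMem*>(tensorData.mem);
}
```

If this assumption holds for the real struct, the equivalent in the UDO code would be to pass *static_cast<cl_mem*>(tensorData0->mem) as the buffer argument of clEnqueueReadBuffer, while still passing tensorData0->mem unchanged to clSetKernelArg.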

Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries (“Qualcomm”). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.
