Hi,
When implementing custom layers with the UDO framework, the GPU data can be accessed as shown in the UDO Softmax example:
SnpeUdo_GpuTensorData_t* tensorData0 = static_cast<SnpeUdo_GpuTensorData_t*>(inputs[0].tensorData);
That variable is then used to set the kernel argument as follows:
err = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)tensorData0->mem);
tensorData0->mem points to a cl_mem object.
However, calling clEnqueueReadBuffer / clEnqueueWriteBuffer on the above cl_mem object throws an error, and even when the operation completes, the values returned do not look valid (lots of 0s with some massive outliers around 10e14 / 10e-14).
Is there a way to access the data in the GPU buffers?
For example, using the --debug flag when running snpe-net-run dumps the output of each layer, so it must be possible to access this data. How can I do it?
Dear customer,
You can specify the execution options in detail, using profiling_level together with the --debug option, to enable output for all layers in qnn-net-run. That method will give you the results for every layer after the model has executed.
BR.
Wei
Hi Wei,
I am already using --debug to get the layer outputs, but I want to know how to access the layer output from within the code itself.
Questions:
1. Can I run multiple kernels within the same layer?
For example, run kernel 1 on the inputs and write to a temporary cl_mem buffer, then run kernel 2 with this temporary buffer as its input and write to the output buffer.
2. Can I access the layer input and output data?
When I use clEnqueueReadBuffer and clEnqueueWriteBuffer on the input / output buffers, I get errors / segfaults. Is there some limitation, or is there something else I need to do on my side?