Forums - OpenCL very slow with adreno 650

OpenCL very slow with adreno 650
hterrolle
Join Date: 31 Jul 21
Posts: 3
Posted: Wed, 2021-08-11 04:01

Hi,

This is my second attempt at posting this message.

I bought a Xiaomi Mi 10 Pro (Snapdragon 865, Adreno 650) running Android and found that OpenCL runs very slowly on it, even slower than on my old Huawei Honor Play (Kirin 970, Mali-G72 MP12).

I agree that the 865 is much faster for OpenGL and Vulkan, but for OpenCL it is very slow.

So my question is: why is OpenCL so slow on the Adreno 650? Kernel execution in particular is very slow. Is it because the Adreno 650 in mobile phones was not designed for OpenCL, or is it something else?

I ran some tests on both devices and can share them if you need more information.

Regards,
Herve Terrolle (France)

Thanks for the answer. ;))

 

ssaikatsaha1996
Join Date: 22 Oct 20
Posts: 11
Posted: Mon, 2022-09-12 12:18
I see the same problem: for OpenCL, the Mali GPU performs better than the Adreno GPU. Is there any way to improve OpenCL performance on the Adreno GPU?
DuBo
Join Date: 9 Dec 13
Posts: 72
Posted: Tue, 2022-11-29 23:28

Dear Customer,

Can you test whether the settings below help?

The Qualcomm Adreno GPU CL driver provides the two extensions below, which let you set a CL perf hint and a CL priority hint when creating a CL context.

/*************************************
* cl_qcom_perf_hint extension *
*************************************/
#define CL_PERF_HINT_NONE_QCOM                       0
typedef cl_uint                                     cl_perf_hint;

#define CL_CONTEXT_PERF_HINT_QCOM                   0x40C2

/*cl_perf_hint*/
#define CL_PERF_HINT_HIGH_QCOM                      0x40C3
#define CL_PERF_HINT_NORMAL_QCOM                    0x40C4
#define CL_PERF_HINT_LOW_QCOM                       0x40C5

/*************************************
* cl_qcom_priority_hint extension *
*************************************/
#define CL_PRIORITY_HINT_NONE_QCOM                   0
typedef cl_uint                                     cl_priority_hint;

#define CL_CONTEXT_PRIORITY_HINT_QCOM               0x40C9

/*cl_priority_hint*/
#define CL_PRIORITY_HINT_HIGH_QCOM                  0x40CA
#define CL_PRIORITY_HINT_NORMAL_QCOM                0x40CB
#define CL_PRIORITY_HINT_LOW_QCOM                   0x40CC

Example code:
{
    cl_int nErr = CL_SUCCESS;
    cl_platform_id      m_clPlatformID;
    cl_device_id        m_clDeviceID;
    cl_context          m_clContext;
    cl_context_properties contextPropertyList[5];
    int propIndex = 0;

    // <snip> (platform/device setup omitted; do not paste this marker into real code)

    // Set context perf hint to high
    contextPropertyList[propIndex++] = CL_CONTEXT_PERF_HINT_QCOM;
    contextPropertyList[propIndex++] = CL_PERF_HINT_HIGH_QCOM;
    // Set context priority hint to low
    contextPropertyList[propIndex++] = CL_CONTEXT_PRIORITY_HINT_QCOM;
    contextPropertyList[propIndex++] = CL_PRIORITY_HINT_LOW_QCOM;
    // A zero entry terminates the property list
    contextPropertyList[propIndex++] = 0;

    // pfn_notify, LOGE, result, GPP_CL_FAILURE and CleanUp belong to the
    // surrounding application code, which is not shown here.
    m_clContext = clCreateContext(contextPropertyList, 1, &m_clDeviceID, pfn_notify, NULL, &nErr);
    if (nErr != CL_SUCCESS)
    {
        LOGE("clCreateContext failed with ErrorCode : %d", nErr);
        result = GPP_CL_FAILURE;
        goto CleanUp;
    }

    // <snip> (rest of the function omitted)
}
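
Before passing these properties, it may be worth checking that the device actually advertises the two extensions. The sketch below is only an illustration: `has_extension` and `build_hint_properties` are hypothetical helper names, the extension names are taken from the header comments above, and the extension string is assumed to be the space-separated list returned by clGetDeviceInfo with CL_DEVICE_EXTENSIONS. It uses `intptr_t` in place of `cl_context_properties` so that it compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the extension header quoted in this post. */
#define CL_CONTEXT_PERF_HINT_QCOM      0x40C2
#define CL_PERF_HINT_HIGH_QCOM         0x40C3
#define CL_CONTEXT_PRIORITY_HINT_QCOM  0x40C9
#define CL_PRIORITY_HINT_LOW_QCOM      0x40CC

/* Hypothetical helper: returns 1 if `name` appears as a whole
 * space-separated token in `ext_list`, 0 otherwise. */
static int has_extension(const char *ext_list, const char *name)
{
    size_t n = strlen(name);
    const char *p = ext_list;

    if (n == 0)
        return 0;
    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == ext_list) || (p[-1] == ' ');
        int ends   = (p[n] == '\0') || (p[n] == ' ');
        if (starts && ends)
            return 1;
        p += 1;  /* keep searching after a partial match */
    }
    return 0;
}

/* Hypothetical helper: fills `props` (capacity >= 5) with only the hints
 * the device advertises, then zero-terminates the list. Returns the number
 * of entries written, terminator included. */
static size_t build_hint_properties(const char *ext_list, intptr_t *props)
{
    size_t i = 0;

    assert(ext_list != NULL && props != NULL);
    if (has_extension(ext_list, "cl_qcom_perf_hint")) {
        props[i++] = CL_CONTEXT_PERF_HINT_QCOM;
        props[i++] = CL_PERF_HINT_HIGH_QCOM;
    }
    if (has_extension(ext_list, "cl_qcom_priority_hint")) {
        props[i++] = CL_CONTEXT_PRIORITY_HINT_QCOM;
        props[i++] = CL_PRIORITY_HINT_LOW_QCOM;
    }
    props[i++] = 0;
    return i;
}
```

In a real program the resulting array would be passed as the first argument of clCreateContext. On a device without these extensions the list contains only the terminating zero, so context creation falls back to default behaviour instead of failing on an unknown property.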

We also provide the CL API below to set the perf hint after the context has been created.
cl_int clSetPerfHintQCOM(cl_context context, cl_perf_hint perf_hint);

Example code:
{
    clSetPerfHintQCOM(ctx->ctx, CL_PERF_HINT_HIGH_QCOM);
    // Command Queue related
    ctx->queue = clCreateCommandQueue(ctx->ctx, ctx->deviceID,
                                      kOpenClCommandQueueProperties,
                                      &opencl_err);
}

 

Thanks
Bob Du

ssaikatsaha1996
Join Date: 22 Oct 20
Posts: 11
Posted: Wed, 2022-11-30 07:57
Dear Sir/Madam @DuBo,

I am on Android 12 with an Adreno 650. Where can I find the Adreno OpenCL driver? Is it part of the Adreno GPU SDK? I need more information about the Adreno OpenCL driver so that I can compile this myself.

Thank you.

Regards,
Saikat Saha
ssaikatsaha1996
Join Date: 22 Oct 20
Posts: 11
Posted: Thu, 2022-12-01 18:35

    LOGE("clCreateContext failed with ErrorCode : %d", nErr);
    result = GPP_CL_FAILURE;
    goto CleanUp;

I am getting errors with this code:

1) error: 'result' undeclared
2) error: expected expression before '<' token
   358 |
3) error: 'GPP_CL_FAILURE' undeclared

How do I solve the GPP_CL_FAILURE error?
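
These errors look like what happens when the example is pasted verbatim: `result`, `GPP_CL_FAILURE` and `CleanUp` are names from a larger program that was not posted, and a literal `<snip>` line is not valid C. The sketch below shows one way the same goto-cleanup pattern could be made to compile on its own. Everything in it is an assumption for illustration: the values of `GPP_CL_SUCCESS` and `GPP_CL_FAILURE`, the `LOGE` macro, and `fake_create_context`, which stands in for clCreateContext so that the sketch builds without the OpenCL headers.

```c
#include <stdio.h>

/* Hypothetical stand-ins for names the posted snippet assumes already exist. */
#define GPP_CL_SUCCESS   0
#define GPP_CL_FAILURE (-1)
#define LOGE(...) fprintf(stderr, __VA_ARGS__)

/* Stand-in for clCreateContext: reports `simulated_err` through err_out
 * and returns NULL on failure, as the real call does. */
static void *fake_create_context(int simulated_err, int *err_out)
{
    static int dummy;

    *err_out = simulated_err;
    return simulated_err == 0 ? (void *)&dummy : NULL;
}

static int init_context(int simulated_err)
{
    int   result  = GPP_CL_SUCCESS;  /* the 'result' the compiler could not find */
    int   nErr    = 0;
    void *context = NULL;

    /* The <snip> markers in the post stand for omitted code;
     * they must be replaced or removed, not compiled. */

    context = fake_create_context(simulated_err, &nErr);
    if (nErr != 0) {
        LOGE("clCreateContext failed with ErrorCode : %d\n", nErr);
        result = GPP_CL_FAILURE;
        goto CleanUp;
    }

    /* ... use the context here ... */

CleanUp:
    /* A real program would release what it created here
     * (for example with clReleaseContext). */
    (void)context;
    return result;
}
```

Calling `init_context(0)` takes the success path, and any non-zero argument takes the failure path through the `CleanUp` label.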
DuBo
Join Date: 9 Dec 13
Posts: 72
Posted: Thu, 2023-04-20 04:01

OpenCL kernel performance depends on several factors, especially the local work size; you need to fine-tune the local work size to get the best performance.

You also need to check whether your kernel has too large a register footprint, which can cause registers to spill to global memory and hurt performance.
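
As a starting point for that tuning, one option is to list the local work sizes worth timing. The sketch below is an assumption about how such a sweep might begin, not a Qualcomm recommendation: `candidate_local_sizes` is a hypothetical helper that keeps only power-of-two sizes that fit within the work-group limit and divide the global size evenly, which OpenCL 1.x requires for an explicit local size.

```c
#include <stddef.h>

/* Hypothetical helper: writes into `out` every power-of-two local size that
 * is <= max_wg_size and divides global_size evenly. Returns how many values
 * were written (never more than `cap`). */
static size_t candidate_local_sizes(size_t global_size, size_t max_wg_size,
                                    size_t *out, size_t cap)
{
    size_t count = 0;
    size_t ls = 1;

    if (global_size == 0)
        return 0;
    while (ls <= max_wg_size && count < cap) {
        if (global_size % ls == 0)
            out[count++] = ls;
        if (ls > max_wg_size / 2)
            break;  /* doubling would exceed the limit (and could overflow) */
        ls *= 2;
    }
    return count;
}
```

Each candidate could then be passed as the local_work_size argument of clEnqueueNDRangeKernel on a queue created with CL_QUEUE_PROFILING_ENABLE, and timed by reading CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END with clGetEventProfilingInfo. The per-kernel limit for `max_wg_size` can be queried with clGetKernelWorkGroupInfo and CL_KERNEL_WORK_GROUP_SIZE.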

Thanks
Bob Du

