Forums - OpenCL Work group Size varies with complexity of kernel

ravibanger
Join Date: 8 Feb 16
Posts: 2
Posted: Fri, 2016-09-02 02:23

I have an OpenCL kernel that is fairly well optimized, but I am running into the following problem: as the complexity of the kernel or its local memory usage increases, the maximum work-group size decreases. I believe the kernel would perform better if I could increase the work-group size. The total local memory allocated is less than 10 KB, which is well below the 32 KB limit of the Adreno 530 hardware. With 10 KB of local memory I am able to launch only 64 threads per work-group (the maximum possible being 1024). Saving the OpenCL binary image also does not produce readable code.

I am looking for directions on how to improve performance beyond this. For example, can I improve performance by somehow launching the kernel with a larger work-group size? Looking at the generated code in a readable form would probably help me. (How do I do this?)
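
For context, the per-kernel limit described above is the value that `clGetKernelWorkGroupInfo` reports for `CL_KERNEL_WORK_GROUP_SIZE`. The OpenCL specification says the implementation derives it from the kernel's resource requirements (register usage and so on), so it can drop well below the device-wide maximum even when local memory is not exhausted. The sketch below is only an illustration of working within that limit: `pick_local_size` is a hypothetical helper, not part of OpenCL, and it assumes a one-dimensional range on OpenCL 1.x, where the global size must be evenly divisible by the local size.

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API): returns the largest local
 * work size that divides global_size evenly and does not exceed
 * kernel_max_wg. kernel_max_wg is assumed to be the value queried with
 * clGetKernelWorkGroupInfo(..., CL_KERNEL_WORK_GROUP_SIZE, ...).
 * Returns 0 if either argument is 0. */
size_t pick_local_size(size_t global_size, size_t kernel_max_wg)
{
    if (global_size == 0 || kernel_max_wg == 0)
        return 0;

    /* A work-group can never be larger than the whole range. */
    size_t limit = kernel_max_wg < global_size ? kernel_max_wg : global_size;

    /* Walk downward until a size divides the global range evenly. */
    for (size_t candidate = limit; candidate > 1; --candidate) {
        if (global_size % candidate == 0)
            return candidate;
    }
    return 1; /* 1 always divides the global size */
}
```

With the numbers from this post, `pick_local_size(4096, 64)` yields 64, while the same call with a limit of 1024 would yield 1024. The helper only selects a legal size under the reported limit; raising the limit itself would presumably require reducing the kernel's per-work-item resource usage.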

 

 

