I am having a very weird issue with OpenCL on the Adreno GPU of a Snapdragon 845. The board is running Ubuntu 16.04.
The majority of my company's stack is written in Python, with specific algorithms and solver libraries written in C++ and HPC languages such as OpenCL. Generally speaking, the application main loop runs in Python and loads extension libraries (.so files) that contain the native code. I am experiencing a very strange issue where all OpenCL API functions fail when they are loaded and executed from a Python context, but behave normally when executed from a native C++ executable. Below is a link to a code sample that can be used to replicate this problem.
The code sample demonstrates the kind of execution that our stack is built on. It consists of a core C++ lib that uses OpenCL. This lib is built with pybind11 and loaded by our Python stack at runtime as an extension module. The lib interface allows for calling C++ code from Python. These C++ functions in turn call the OpenCL API.
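To make the Python side of this concrete, here is a minimal sketch of how an extension module of this kind gets loaded and called at runtime. It is only an illustration, not the code from the sample: the names `cl_core` and `run_test` in the usage note are hypothetical placeholders.

```python
import importlib


def call_extension(module_name, func_name, *args):
    """Import a compiled extension module by name and call one of its functions.

    Returns (True, result) on success, or (False, reason) when the module
    cannot be imported or does not expose a callable with that name.
    """
    try:
        # For a pybind11 extension this is where the .so file is loaded.
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return (False, f"import failed: {exc}")

    func = getattr(module, func_name, None)
    if not callable(func):
        return (False, f"{module_name} has no callable {func_name!r}")

    # The call crosses into native code, which in our case calls the OpenCL API.
    return (True, func(*args))
```

With the hypothetical names above, the call would be `call_extension("cl_core", "run_test")`.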
The sample code tests the lib by manually linking it to a C++ binary as well as dynamically loading it from Python. In the former case, the pybind11 interface is unused. The behaviour that we observe is that the natively compiled C++ program runs as expected, but when going the Python route, all OpenCL API functions fail. Using GDB I have verified that the two computation paths use the same shared library object and follow the same code path within the library.
Since I do not have visibility into the OpenCL library, I have no information as to why the calls fail when the top-level function call originates from Python.
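One way to take our own library out of the picture is to call the OpenCL API from the Python process directly through `ctypes` and look at the raw error code. The following is a rough sketch under some assumptions: that the OpenCL library can be found by `ctypes.util.find_library` or is passed in by path (on this board it may live in a vendor-specific directory), and that probing `clGetPlatformIDs` alone is representative of the other failing calls.

```python
import ctypes
import ctypes.util


def probe_opencl(library_path=None):
    """Call clGetPlatformIDs via ctypes and report what it returns.

    Returns None if no OpenCL library could be loaded, otherwise a tuple
    (error_code, num_platforms). An error_code of 0 is CL_SUCCESS.
    """
    # Assumption: fall back to the system search if no explicit path is given.
    path = library_path or ctypes.util.find_library("OpenCL")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        fn = lib.clGetPlatformIDs
    except (OSError, AttributeError):
        return None

    # cl_int clGetPlatformIDs(cl_uint num_entries,
    #                         cl_platform_id *platforms,
    #                         cl_uint *num_platforms)
    fn.argtypes = [
        ctypes.c_uint,
        ctypes.POINTER(ctypes.c_void_p),
        ctypes.POINTER(ctypes.c_uint),
    ]
    fn.restype = ctypes.c_int

    count = ctypes.c_uint(0)
    # num_entries=0 with platforms=NULL only queries the platform count.
    err = fn(0, None, ctypes.byref(count))
    return (err, count.value)


if __name__ == "__main__":
    print(probe_opencl())
```

If this also returns a non-zero error code inside Python, the failure would seem to be tied to the Python process itself rather than to the pybind11 layer.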
Would somebody be able to help shed some light on this issue?