Buffers

Heterogeneous memory management

Giving hints to the runtime

One difficulty in parallel programming across different devices (CPU, GPU, DSP) is managing memory among them while still presenting a seamless interface to the app developer.

Buffers are array-like data stores for any user-defined data type. The buffer abstraction can be used for any of the specialized memories, such as OpenGL and OpenCL memories and textures, and ION. All buffers are accessible by the CPU, GPU and DSP. The APIs in the buffer component give the Heterogeneous Compute runtime further hints about where to allocate a specific piece of memory.

Knowing in advance that a particular task needs to run across the CPU and DSP, for example, a developer could give that hint to the Qualcomm® Snapdragon™ Heterogeneous Compute runtime. The runtime would then try to allocate ION memory instead of system memory, to avoid copying data when it moves between the devices.
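The idea of a placement hint can be sketched in plain C++. This is a toy model, not the SDK: the names Device, Backing and choose_backing are hypothetical, and the GPU rule is an assumption added for completeness. Only the CPU and DSP case comes from the text above.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical names for illustration; none of these are part of the SDK.
enum class Device { CPU, GPU, DSP };
enum class Backing { SystemMemory, OpenCLMemory, IonMemory };

// Pick a backing store from the devices the developer says will touch the buffer.
Backing choose_backing(std::initializer_list<Device> hints) {
    assert(hints.size() > 0 && "at least one device hint is required");
    bool gpu = false;
    bool dsp = false;
    for (Device d : hints) {
        if (d == Device::GPU) gpu = true;
        if (d == Device::DSP) dsp = true;
    }
    if (dsp) return Backing::IonMemory;     // CPU + DSP case described in the text
    if (gpu) return Backing::OpenCLMemory;  // assumption, not stated in the text
    return Backing::SystemMemory;           // CPU-only buffers need nothing special
}
```

In this sketch, choose_backing({Device::CPU, Device::DSP}) selects ION-style memory so that both devices can share one allocation, while a CPU-only hint falls back to ordinary system memory.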

The APIs help move and synchronize data across compute cores efficiently, with as little data copying as possible.

Buffer APIs

The following APIs are available as part of the buffer component:

create_buffer is a templated API that can be instantiated with any user-defined data type. It creates a data store.

To synchronize data between the host and the device, the application can acquire the buffer with read-only, write-invalidate or read/write access.
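To see why the access mode matters, here is a minimal toy model in standard C++. It is not the SDK's implementation, and ToyBuffer, Access and run_on_device are hypothetical names. It assumes that write-invalidate means the host will overwrite the contents, so nothing has to be copied back from the device first, whereas read-only and read/write both need the latest data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model for illustration; these names are not part of the SDK.
enum class Access { ReadOnly, WriteInvalidate, ReadWrite };

template <typename T>
class ToyBuffer {
public:
    explicit ToyBuffer(std::size_t n) : host_(n), device_(n) {}

    // Host-side acquire. Copies device data back only if the host will read it.
    std::vector<T>& acquire(Access mode) {
        assert(!held_ && "release the buffer before acquiring it again");
        if (mode != Access::WriteInvalidate && state_ == State::DeviceNewer) {
            host_ = device_;
            ++copies_;
            state_ = State::InSync;
        }
        mode_ = mode;
        held_ = true;
        return host_;
    }

    // Host-side release. Any writable mode makes the host copy authoritative.
    void release() {
        assert(held_ && "release without a matching acquire");
        if (mode_ != Access::ReadOnly) state_ = State::HostNewer;
        held_ = false;
    }

    // Simulated device kernel: sync host data to the device if needed, then run f.
    template <typename F>
    void run_on_device(F f) {
        assert(!held_ && "host must release before the device uses the buffer");
        if (state_ == State::HostNewer) {
            device_ = host_;
            ++copies_;
        }
        for (T& x : device_) f(x);
        state_ = State::DeviceNewer;
    }

    // Number of host/device copies performed so far.
    int copies() const { return copies_; }

private:
    enum class State { InSync, HostNewer, DeviceNewer };
    std::vector<T> host_, device_;
    State state_ = State::InSync;
    Access mode_ = Access::ReadOnly;
    bool held_ = false;
    int copies_ = 0;
};
```

In this model, acquiring with Access::WriteInvalidate after a device kernel has run leaves copies() unchanged, while acquiring with Access::ReadOnly at the same point costs one copy. A real runtime backed by shared memory may avoid even that copy.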

Code sample

The following code sample extends the previous ones so that vector_double works on buffers.

First, instead of vector_double taking two vectors, the highlighted code uses the create_buffer API to create two Snapdragon Heterogeneous Compute buffers, b1 and b2, of 10 integers each. Buffers are designed to look syntactically similar to C++ vectors.

Code Sample 1

Next, acquire_wi acquires the buffer for the application in write-invalidate access mode.

Code Sample 2

The application writes data into the b1 buffer and then releases it. That data is what any task created from this point forward will see.

Then, create_task binds the CPU kernel created earlier and buffers b1 and b2 to the task.

Code Sample 3

Finally, as before, launch launches the task into the Snapdragon Heterogeneous Compute runtime and wait_for awaits completion of the task.

Code Sample 4
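The four steps above can be traced end to end with a self-contained analogue in standard C++. This is a sketch under stated assumptions, not SDK code: IntBuffer, ToyTask and ToyRuntime are hypothetical stand-ins for the buffer, task and runtime, the toy runtime runs tasks on the calling thread, and the kernel body (each output element is twice the input element) is assumed from the name vector_double.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-ins for illustration; none of these names are the SDK's API.
using IntBuffer = std::vector<int>;

struct ToyTask {
    std::function<void()> body;  // kernel already bound to its buffers
    bool done = false;
};

class ToyRuntime {
public:
    // Analogue of launch: hand the task to the runtime; nothing runs yet.
    void launch(std::shared_ptr<ToyTask> task) { queue_.push_back(std::move(task)); }

    // Analogue of wait_for: run queued work until the given task has finished.
    void wait_for(const std::shared_ptr<ToyTask>& task) {
        while (!task->done && !queue_.empty()) {
            std::shared_ptr<ToyTask> next = queue_.front();
            queue_.pop_front();
            next->body();
            next->done = true;
        }
    }

private:
    std::deque<std::shared_ptr<ToyTask>> queue_;
};

// Assumed kernel body: each output element is twice the input element.
void vector_double(const IntBuffer& in, IntBuffer& out) {
    assert(in.size() == out.size());
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = 2 * in[i];
}

IntBuffer run_vector_double_example() {
    // Step 1: two buffers of 10 integers each.
    IntBuffer b1(10), b2(10);

    // Step 2: fill b1 before any task uses it (acquire, write, release in the SDK flow).
    for (std::size_t i = 0; i < b1.size(); ++i) b1[i] = static_cast<int>(i);

    // Step 3: bind the kernel and both buffers to a task.
    auto task = std::make_shared<ToyTask>();
    task->body = [&b1, &b2] { vector_double(b1, b2); };

    // Step 4: launch the task, then wait for it to complete.
    ToyRuntime runtime;
    runtime.launch(task);
    runtime.wait_for(task);
    return b2;
}
```

Calling run_vector_double_example() returns the contents of b2 after the task completes. In the real SDK the runtime decides where and when the task executes; the toy runtime only preserves the ordering of create, launch and wait.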

Next steps

That is an overview of the Snapdragon Heterogeneous Compute SDK, its components and how to use them. The main intent of the SDK is to help developers achieve better performance and take advantage of the heterogeneous nature of the Snapdragon chipset.

Naturally, the performance increase often comes with an impact on power consumption and, therefore, battery life.

So, how can developers optimize power consumption in apps that use heterogeneous computing? By using the Snapdragon Power Optimization SDK.

Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.