How can I improve the performance of my application? How can I get more out of my hardware?
If limited processing power, power consumption and heat are holding back your product’s performance, learning about heterogeneous computing could be the answer to your questions.
When we talk about heterogeneous computing, we’re talking about how the separate processing ‘blocks’ of the Qualcomm® Snapdragon™ system-on-a-chip (SOC) can be used to effectively handle the work demanded by today’s applications. We like talking about it because it can really help to get the best out of Snapdragon processors.
How does this work? Say you have a program that executes many simple operations, like multiplying vectors of data. Handing the computations to a processor optimized for data-parallelism (like a GPU) makes sense: the GPU is streamlined for executing simple instructions over lots of data while staying power-efficient. On the other hand, giving the same task to a general-purpose, complex instruction processor like the CPU will consume much more power and often may not reach the same performance. This also holds true in the other direction: a program with a lot of branching logic, pointer chasing and pointer manipulation will run inefficiently on a processor optimized around a simple processing model, but the same program will perform very well on a CPU.
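To make the two workload shapes concrete, here is a minimal sketch in plain Python. It runs entirely on the CPU and uses no Snapdragon SDK; the names `multiply_vectors`, `Node` and `sum_positive` are our own illustrations, not part of any Qualcomm API. The point is only the structure of the work: in the first function every output element is independent of the others, while in the second each step has to wait for the one before it.

```python
from dataclasses import dataclass
from typing import Optional


def multiply_vectors(a, b):
    """Data-parallel shape: output i depends only on a[i] and b[i]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Every multiplication is independent, so the elements could in
    # principle all be computed at the same time.
    return [x * y for x, y in zip(a, b)]


@dataclass
class Node:
    """One link in a singly linked list (illustrative only)."""
    value: int
    next: Optional["Node"] = None


def sum_positive(head):
    """Branchy, pointer-chasing shape: each step depends on the last."""
    total = 0
    node = head
    while node is not None:
        if node.value > 0:  # data-dependent branch
            total += node.value
        node = node.next  # the next node is unknown until this one is read
    return total


products = multiply_vectors([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
chain = Node(3, Node(-1, Node(4)))
total = sum_positive(chain)
```

Work shaped like `multiply_vectors` is the kind that maps naturally onto a data-parallel processor, while work shaped like `sum_positive` is the kind that tends to suit the CPU.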
By analogy, imagine a busy restaurant. Think of the CPU as the head chef, the GPU as the sous-chef, and the DSP as the prep cook. When an order comes in, the head chef doesn’t do everything personally, because then the diners could not all be served in good time. The chef distributes the work to the specialists who excel at completing their tasks with great efficiency, improving the overall ability of the kitchen to meet the demands of its patrons.
That’s the basic principle behind heterogeneous computing: the CPU shouldn’t overwork trying to handle every task consecutively when jobs can be done more efficiently by the specialized processing blocks of the SOC in parallel.
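As a rough sketch of that principle, the Python example below hands two independent jobs to workers while the coordinator carries on with its own logic. Everything here is an assumption for illustration: the worker threads merely stand in for the specialized processing blocks, the function names are hypothetical, and nothing is actually offloaded to a GPU or DSP. Python threads also will not speed up CPU-bound work like this, so read it as a picture of the dispatch pattern rather than a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(pixels):
    # Hypothetical stand-in for graphics work a GPU would take on.
    return [p * 2 for p in pixels]


def filter_audio(samples):
    # Hypothetical stand-in for signal processing a DSP would take on.
    return [s for s in samples if abs(s) > 0.1]


def run_app_logic(events):
    # Hypothetical stand-in for branchy control logic that stays on the CPU.
    return [e.upper() for e in events if e]


def run_consecutively(pixels, samples, events):
    """One processor does every job, one after another."""
    return render_frame(pixels), filter_audio(samples), run_app_logic(events)


def run_dispatched(pixels, samples, events):
    """Independent jobs are handed off; the coordinator keeps its own work."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        frame = pool.submit(render_frame, pixels)
        audio = pool.submit(filter_audio, samples)
        logic = run_app_logic(events)  # runs while the workers are busy
        return frame.result(), audio.result(), logic


inputs = ([1, 2, 3], [0.05, 0.5, -0.3], ["tap", "", "swipe"])
dispatched = run_dispatched(*inputs)
```

Both functions produce the same results; what changes is who does the work and whether the jobs have to wait for each other.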
Today’s mobile, VR and IoT applications combine highly specialized computational operations with complicated processing instructions as part of the overall application experience. What developers need is a system for heterogeneous computing that lets them use the best processor for each of the many tasks that make up their application.
Snapdragon processors are designed from the ground up to support heterogeneous computing, which gives you control over the distribution of processing within the SOC. By choosing how to distribute tasks across the different processing blocks, and using the associated SDKs to fine-tune those sub-components, you can optimize your application’s performance at the hardware level.
So which ‘processing blocks’ in a Snapdragon processor are dedicated to different processing tasks?
First, we have the Qualcomm Kryo™ CPU. This is the workhorse and taskmaster, doing the bulk of regular processing work. Complicated control logic and general-purpose instructions are ideally carried out here.
Second, we have the Qualcomm Adreno™ GPU. This is the block that is best suited to processing graphics, as well as the complex computations involved with machine learning and AI. GPUs excel at performing similar computations on very large quantities of data.
Third, there is the Qualcomm Hexagon™ DSP. The DSP is best suited to process digital signals from the outside world, like those generated by a smartphone camera and microphone.
Using these heterogeneous processing blocks within Snapdragon processors can help to achieve better performance while minimizing power consumption and thermal issues.
Supported by software resources such as the SDKs and profilers available on QDN, you can use heterogeneous computing techniques to get your application performance really cooking (but not overheating)!
We think heterogeneous computing techniques are exciting, because they pave the way for boundary-pushing mobile applications, from virtual and augmented reality to machine learning applications and IoT.
In the next blog in this series, we will take a longer look at the tools involved. The third part will get into using these tools to achieve advanced optimizations! See you then!