Comparisons and conclusions

Balancing along a spectrum

The results of the tutorial appear below:

[Image: Comparisons Image 1]

Parallelization alone slashed processing time, but at the cost of considerably higher CPU power consumption. Parallelization combined with power tuning, on the other hand, improved both processing time and CPU power consumption to an acceptable degree.

The exercise demonstrates that, even though developers cannot escape the trade-off between performance and power consumption, the SDKs provide APIs for finding the optimal point on the spectrum.
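One way to compare points on that spectrum is to look at energy as well as time, since energy is average power multiplied by duration. The following is a minimal sketch under that assumption; the `Run` struct and both function names are hypothetical helpers for illustration, not part of either SDK.

```cpp
#include <cassert>

// One measured configuration: how long it ran and its average power draw.
struct Run {
    double seconds;    // processing time
    double avg_watts;  // average CPU power over the run
};

// Energy in joules is average power multiplied by time.
double energy_joules(const Run& r) {
    assert(r.seconds >= 0.0 && r.avg_watts >= 0.0);
    return r.seconds * r.avg_watts;
}

// True when `a` is no worse than `b` on both time and energy.
bool dominates(const Run& a, const Run& b) {
    return a.seconds <= b.seconds && energy_joules(a) <= energy_joules(b);
}
```

With made-up placeholder values such as `Run{10.0, 1.0}` for a sequential run and `Run{4.0, 3.0}` for a parallel one, the parallel run finishes sooner but uses more energy, so neither configuration dominates the other. Real figures would have to come from measurements on the target device.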

Another strategy for saving power is to choose the right compute device (CPU, GPU or DSP) for execution, keeping in mind that the Snapdragon® Power Optimization SDK cannot be used to modify the power consumption of the DSP. If, for example, the application involves a lot of computation, it is better to schedule that work on the big cluster and take advantage of its high performance. If the application processes images, the GPU is more appropriate because of its inherent parallelism. Workloads such as inference of fixed-point machine learning models run more efficiently and consume far less power on the DSP. Choosing the right device affects not only performance but also power consumption.
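The guidance above can be written down as a small lookup. This is only an illustrative sketch: the `Workload` and `Device` enums and both functions are hypothetical names invented here, not types or calls from the Snapdragon SDKs, and a real application would base the decision on profiling.

```cpp
#include <cassert>

// Hypothetical workload categories, taken from the examples in the text.
enum class Workload { HeavyCompute, ImageProcessing, FixedPointInference };

// Hypothetical device labels; these are not SDK types.
enum class Device { CpuBig, Gpu, Dsp };

// Map a workload category to the device the text suggests for it.
Device pick_device(Workload w) {
    switch (w) {
        case Workload::HeavyCompute:        return Device::CpuBig;
        case Workload::ImageProcessing:     return Device::Gpu;
        case Workload::FixedPointInference: return Device::Dsp;
    }
    assert(false && "unknown workload category");
    return Device::CpuBig;  // fallback, only reachable when asserts are disabled
}

// The text notes that the Power Optimization SDK cannot modify DSP power
// consumption, so only CPU and GPU placements can be tuned further.
bool power_tunable(Device d) {
    return d != Device::Dsp;
}
```

For example, `pick_device(Workload::ImageProcessing)` returns `Device::Gpu`, and `power_tunable(Device::Dsp)` returns `false`.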


The following three steps are recommended for balancing power and performance:

  1. Extract as much parallelism as possible from the heterogeneous nature of the Snapdragon mobile platform. Features such as affinity, patterns, tasks and buffers in the Snapdragon Heterogeneous Compute SDK are designed for that purpose.
  2. Control placement of algorithm execution on the correct device. On the CPU, use the affinity feature to control whether the program construct runs on the big or the LITTLE cluster. Alternatively, use the task framework to schedule the work on the CPU, GPU or DSP.
  3. Once the code is properly modified for parallelism, the app will achieve better performance than with sequential processing. Then, as in this case study, use the Snapdragon Power Optimization SDK to send the system further hints for controlling core frequencies on the CPU and GPU.

Together, these steps improve both the power consumption and the performance of the application.

Your turn

Get started parallelizing and tuning your own apps with the tools used in this tutorial: