Parallel variant with power tuning

Optimizing power consumption

To summarize, parallelizing the algorithm for finding all primes under 10,000,000 results in a vastly shorter processing time but unacceptably high power consumption:

Parallel Variant With Power Tuning Image 1

The previous variant establishes the highest level of performance attainable by parallelizing the algorithm. To tune power consumption, the Snapdragon® Power Optimization SDK provides the static API request_mode. Here, the API is invoked to run both the big and LITTLE clusters at 0 to 15 percent of maximum CPU frequency. Setting CPU frequency to that range establishes the highest level of power saving attainable while still parallelizing the algorithm.

Parallel Variant With Power Tuning Image 2

In that code snippet, the request is made just before the pfor_each pattern, so the CPU cores run at the lowest possible frequency while the pattern executes.
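The snippet itself is available only as an image, so as a rough illustration of the ordering described above, the following C++ sketch places a frequency request immediately before a parallel loop. It does not use the SDK: request_window_stub and parallel_for_each_stub are hypothetical stand-ins for request_mode and pfor_each, whose real signatures are not given here, and the stub only records the requested range rather than changing any CPU frequency.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// HYPOTHETICAL stand-in for the SDK's static request_mode API.
// It only records the requested window; it does not touch the hardware.
struct FrequencyWindow {
    int min_percent;  // lower bound, percent of maximum CPU frequency
    int max_percent;  // upper bound, percent of maximum CPU frequency
};

inline FrequencyWindow& last_request() {
    static FrequencyWindow window{0, 100};
    return window;
}

inline void request_window_stub(int min_percent, int max_percent) {
    assert(0 <= min_percent && min_percent <= max_percent && max_percent <= 100);
    last_request() = FrequencyWindow{min_percent, max_percent};
}

// Simple trial-division primality test (assumed here; the article's own
// algorithm is not shown in the text).
inline bool is_prime(std::uint32_t n) {
    if (n < 2) return false;
    for (std::uint32_t d = 2; static_cast<std::uint64_t>(d) * d <= n; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// HYPOTHETICAL stand-in for pfor_each: applies body to every value in
// [first, last), with the values interleaved across worker threads.
template <typename Body>
void parallel_for_each_stub(std::uint32_t first, std::uint32_t last,
                            unsigned workers, Body body) {
    if (workers == 0) workers = 1;
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([=] {
            for (std::uint64_t i = static_cast<std::uint64_t>(first) + w;
                 i < last; i += workers) {
                body(static_cast<std::uint32_t>(i));
            }
        });
    }
    for (auto& t : pool) t.join();
}

// Counts the primes below limit. The frequency request comes first, so the
// parallel loop that follows would run under the requested window.
inline std::size_t count_primes_tuned(std::uint32_t limit, unsigned workers) {
    request_window_stub(0, 15);  // 0 to 15 percent of maximum frequency
    std::atomic<std::size_t> count{0};
    parallel_for_each_stub(0, limit, workers, [&count](std::uint32_t n) {
        if (is_prime(n)) count.fetch_add(1, std::memory_order_relaxed);
    });
    return count.load();
}
```

For example, count_primes_tuned(100, 4) counts the 25 primes below 100 using four worker threads. On a real device, the request would go through the SDK rather than the stub.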

Snapdragon Profiler shows that, as before, all eight CPU cores are engaged in parallel:

Parallel Variant With Power Tuning Image 3

Similarly, CPU Utilization is at 100 percent on all cores when the algorithm is running:

Parallel Variant With Power Tuning Image 4

CPU Frequency is at its minimum for both the LITTLE and big clusters:

Parallel Variant With Power Tuning Image 5

Finally, CPU power consumption is 82 mW, which is less than with the sequential variant and far less than with the parallel variant. However, processing time has jumped back up to 26 seconds:

Parallel Variant With Power Tuning Image 6
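The trade-off between lower power and longer run time can also be expressed as energy. As a back-of-the-envelope sketch, assuming the 82 mW figure is the average CPU power over the whole 26-second run (the text does not state this explicitly), energy is power multiplied by time. The helper name energy_joules is illustrative only.

```cpp
#include <cassert>

// Energy in joules from average power in milliwatts and run time in seconds.
// ASSUMPTION: the power value is an average over the entire run.
inline double energy_joules(double avg_power_mw, double seconds) {
    assert(avg_power_mw >= 0.0 && seconds >= 0.0);
    return (avg_power_mw / 1000.0) * seconds;  // mW -> W, then W * s = J
}
```

With the figures above, energy_joules(82.0, 26.0) comes to about 2.13 J.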
