Parallel variant with power tuning
Optimizing power consumption
To summarize, parallelizing the algorithm for finding all primes under 10,000,000 greatly shortens processing time but draws unacceptably more power:
The previous variant establishes the highest level of performance attainable by parallelizing the algorithm. To tune power consumption, the Qualcomm® Snapdragon™ Power Optimization SDK provides the static API request_mode. Here, the API is invoked to run the big and LITTLE clusters at 0-15 percent of maximum CPU frequency. Restricting CPU frequency to that range establishes the highest level of power saving attainable while still parallelizing the algorithm.
In that code snippet, the request is issued just before the pfor_each pattern, so the CPU cores are already running at the lowest possible frequency when the parallel work begins.
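The ordering described above (power request first, parallel loop second) can be sketched in portable C++. This is only an illustration under stated assumptions: request_frequency_window and parallel_for_each_index are hypothetical stand-ins written for this example, not the SDK's request_mode or pfor_each signatures. The stand-in request merely records the requested window and does not change any CPU frequency.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the SDK's power request (NOT the real request_mode
// signature). It only records the requested window, in percent of max frequency.
struct FrequencyWindow {
    unsigned min_percent;
    unsigned max_percent;
};

static FrequencyWindow g_requested_window = {0, 100};

void request_frequency_window(unsigned min_percent, unsigned max_percent) {
    assert(min_percent <= max_percent && max_percent <= 100);
    g_requested_window.min_percent = min_percent;
    g_requested_window.max_percent = max_percent;
}

// Trial-division primality test, used as the per-element workload.
bool is_prime(std::uint32_t n) {
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    for (std::uint64_t d = 3; d * d <= n; d += 2) {
        if (n % d == 0) return false;
    }
    return true;
}

// Hypothetical stand-in for a pfor_each-style pattern: splits [first, last)
// into contiguous chunks and runs one chunk per worker thread.
template <typename Body>
void parallel_for_each_index(std::uint32_t first, std::uint32_t last,
                             unsigned workers, Body body) {
    if (workers == 0) workers = 1;
    const std::uint64_t total = last > first ? last - first : 0;
    const std::uint64_t chunk = (total + workers - 1) / workers;
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::uint64_t lo = std::min<std::uint64_t>(total, w * chunk);
        const std::uint64_t hi = std::min<std::uint64_t>(total, lo + chunk);
        pool.emplace_back([lo, hi, first, &body] {
            for (std::uint64_t i = lo; i < hi; ++i) {
                body(first + static_cast<std::uint32_t>(i));
            }
        });
    }
    for (std::thread& t : pool) t.join();
}

std::size_t count_primes_below(std::uint32_t limit, unsigned workers) {
    // Issue the power request first, so the 0-15 percent window is already
    // requested before any parallel work is launched.
    request_frequency_window(0, 15);

    // Each index is written by exactly one worker, so no locking is needed.
    std::vector<char> flags(limit, 0);
    parallel_for_each_index(0, limit, workers, [&flags](std::uint32_t n) {
        flags[n] = is_prime(n) ? 1 : 0;
    });
    return static_cast<std::size_t>(std::count(flags.begin(), flags.end(), 1));
}
```

Calling count_primes_below(100, 4) returns 25. In a real application, the stand-in request would be replaced by the SDK call and the thread pool by the pfor_each pattern; the point of the sketch is only that the request precedes the parallel loop.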
Snapdragon Profiler shows that, as before, all eight CPU cores are engaged in parallel:
Similarly, CPU Utilization is at 100 percent on all cores when the algorithm is running:
CPU Frequency, however, is at its minimum for both the LITTLE and big clusters:
Finally, the CPU draws 82 mW, which is less than with the sequential variant and far less than with the parallel variant. However, processing time has jumped back up to 26 seconds: