Processors like the Qualcomm® Kryo™ 385 CPU integrated in the Qualcomm Snapdragon™ 845 Mobile Platform are based on Arm’s big.LITTLE architecture, which consists of “big” high-performance cores for processor-intensive tasks and “little” power-efficient cores for less demanding tasks. The goal is a flexible package for mobile devices that dynamically balances processing demands with power efficiency at both the operating system and application levels.
So with such flexibility available, we wanted to highlight a few key aspects of big.LITTLE to consider when you develop your mobile solutions. How can this type of architecture impact the efficiency of your applications?
Know your Core Strengths
Traditional dynamic voltage and frequency scaling of cores has gone a long way toward improving power efficiency. But having different types of cores at your disposal opens new possibilities for tuning power efficiency as demands change throughout an application’s runtime, and it gives you several choices to think about as you architect your application.
For example, you can write routines that always execute on one type of core, or routines that can migrate between the two types, whether automatically through the operating system’s task scheduler or manually through controls provided by an SDK.
A key step is to identify up front which type(s) of core(s) your various routines should run on. As a general rule of thumb, run on the little cores when possible, because that reduces both heat and power consumption. In planning for this, categorize tasks (e.g., high versus low priority, short-lived versus long-running) and then decide how best to allocate them to the most appropriate cores.
For example, if you know that certain code, such as rendering and artificial intelligence, must execute at 60 fps, those high-priority routines probably need to be permanently allocated to the big cores. Conversely, background threads, such as those sending and receiving network data, can probably be allocated to the little cores.
Understand the Role of the Operating System and API
The operating system (OS), and most notably its task scheduler, plays a key role in coordinating tasks on a big.LITTLE platform. The task scheduler usually has detailed knowledge of the demands each task (e.g., each thread) places on the system. By leveraging this information, it can allocate or migrate tasks to specific cores. In some implementations it can also idle unused cores, which is something to watch for when profiling your application for power efficiency.
A key purpose of the task scheduler in big.LITTLE is to perform run-state migration, which is the process of selecting between big and little cores for tasks. There are three general approaches that task schedulers use:
- Core clustering: cores of the same size are treated as one cluster, and then the most appropriate cluster is chosen based on demands of the system.
- In-kernel switching: a big and a little core are paired into a virtual core in which only one of the two physical cores in that virtual core is used depending on demands.
- Global task scheduling: all physical cores are available all of the time, and a global task scheduler allocates tasks on a per-core basis depending on demands.
It’s a good idea to identify which core configurations are available on your target platform and how the operating system you intend to build on handles run-state migration. Knowing this provides insight into how core selection on that platform works as a whole, and how much benefit it might offer the applications you plan to write.
How much Control is Exposed?
Depending on the manufacturer, the platform may hide big and little core selection so that you don’t even need to think about it in your development, or they may provide varying degrees of control through APIs. For example, the Qualcomm Snapdragon Power Optimization SDK exposes two main modes through its API that allow developers to control power efficiency:
- “static mode”: controls the performance/power balance based on the current state of the application. This mode offers a number of flavors such as Window Mode, which configures the minimum and maximum frequency percentages that cores can use.
- “dynamic mode”: adjusts the core usage and frequencies through a feedback loop that self-regulates the system to a desired throughput metric.
For programmers less inclined to deal with the performance/power balance, many platforms provide out-of-the-box core management configurations that produce very good results for a broad range of application types.
Don’t Forget the Tools
Finally, don’t forget that at some point you’ll need to optimize and debug your application. So to help you understand the inner workings and optimization of big.LITTLE as your application executes, your SDK should include quality development tools. For example, Qualcomm Snapdragon Profiler can provide in-depth visual analytics on how your code is performing on the different cores of the Kryo 385.
In your development IDE, also look for features that facilitate debugging across different cores and core types. The debugger should make it easy to halt execution in multi-threaded applications while you inspect the state of each thread.
The idea behind big.LITTLE is compelling because it can directly increase the efficiency of the applications you develop. While Qualcomm Developer Network has been providing optimization tools for our Snapdragon Mobile Platforms for several years, now is a great time to start experimenting with SDKs that expose big.LITTLE controls to help optimize your development.