Artificial intelligence (AI) and machine learning (ML) are evolving fast, with headline-grabbing changes in games, human conversations, social media and even fuel pumps. And when you step back and look at AI through the eyes of a developer, you can see a carrot-and-stick scenario quickly evolving as well:
- The Carrot: The two main components of ML - training and inference - have until recently been relegated to the cloud. The increasing compute power of mobile processors is setting the stage for running inference workloads on edge devices like smartphones and drones instead of running them in the cloud.
- The Stick: The most compelling applications in today's world are defined by their user experiences. While applying styles to selfies is engaging and fun, the latency involved in performing that work in the cloud will ruin a live experience. If you’re not running workloads like classification and tracking on the device, you’ll annoy your customers and eventually lose them.
- The Rider: Each core on a mobile processor - CPU, GPU and DSP - has its own power/performance profile. As a developer, you can choose where and how to run your workloads most efficiently on an edge device like a smartphone.
That’s why Qualcomm Technologies, Inc. (QTI) developed the Qualcomm® Snapdragon™ Neural Processing Engine (NPE) SDK, which has been in limited release for a few months and is now available to the broader developer community via the Qualcomm Developer Network. It’s designed to accelerate neural network processing on Snapdragon devices and to allow you, the developer, to easily choose the optimal core for your specific user experience: Qualcomm Kryo™ CPU, Qualcomm Adreno™ GPU or Qualcomm Hexagon™ DSP.
AI with Higher Performance and Lower Power Consumption
On any device at the edge, AI is a tricky balance between performance and power consumption.
Say you’ve trained your neural network model in the cloud. The next step is to write your mobile application to run inference against that model to recognize faces, track objects, detect voices, understand language and even add cats to selfies. You know how latency will ruin the UX if you run everything in the cloud, so you write your app to run inference workloads on the CPU of the device. But soon you realize that you’re getting below-average performance at the cost of above-average battery drain.
“I think I’ll try a different core instead of the CPU,” you say.
Good idea. The CPU, GPU and DSP on the Snapdragon processor handle workloads differently from one another. A speech detection application might run with the best power/performance profile on the Hexagon DSP, while an object detection or style transfer application might be better suited to the Adreno GPU.
We’ve designed the NPE SDK to let you evaluate how efficiently each core executes your models. You then decide where to program your app to run them.
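To make that evaluate-then-decide step concrete, here is a minimal sketch of how you might pick a core once you have per-core measurements in hand. It does not use the NPE SDK API. The `CoreResult` type, the `pick_core` helper and every number in it are hypothetical placeholders standing in for whatever your own benchmarking produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreResult:
    """One benchmark measurement for a single core (hypothetical structure)."""
    core: str          # "CPU", "GPU" or "DSP"
    latency_ms: float  # average time per inference
    energy_mj: float   # average energy per inference


def pick_core(results, max_latency_ms):
    """Return the lowest-energy core that meets the latency budget.

    If no core meets the budget, fall back to the fastest core.
    """
    if not results:
        raise ValueError("no benchmark results supplied")
    within_budget = [r for r in results if r.latency_ms <= max_latency_ms]
    if within_budget:
        return min(within_budget, key=lambda r: r.energy_mj).core
    return min(results, key=lambda r: r.latency_ms).core


# Placeholder numbers for illustration only; these are not measured data.
example = [
    CoreResult("CPU", latency_ms=120.0, energy_mj=90.0),
    CoreResult("GPU", latency_ms=30.0, energy_mj=25.0),
    CoreResult("DSP", latency_ms=40.0, energy_mj=12.0),
]

print(pick_core(example, max_latency_ms=50.0))
```

The design choice here is to treat latency as a hard constraint set by your user experience and energy as the thing to minimize within it. A live camera effect might set a tight budget, while a background task could relax it and save battery.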
Improvement on Neural Network Models Trained in Caffe/Caffe2 and TensorFlow
QTI has focused on two open-source deep learning framework families: Caffe/Caffe2 and TensorFlow. Caffe2 is developed and sponsored by Facebook, and TensorFlow by Google. And, if you have proprietary or custom network layers not currently supported in Caffe or TensorFlow, the NPE SDK gives you the flexibility to add them.
What kind of boost comes from offloading the inference workload to the right core? On commercial handsets, our tests show a 4-5x improvement in performance and energy efficiency on the Adreno GPU, and up to another 2x improvement on the Hexagon DSP using vector extensions (HVX).
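If you want to see how those two figures stack, the arithmetic below is one way to read them. It assumes that "another 2x" multiplies the GPU gain and that the baseline is the CPU, neither of which is spelled out above, so treat the result as an upper-bound illustration and not a measured number. The `combined_gain` helper is a hypothetical name.

```python
def combined_gain(gpu_gain, dsp_extra_gain):
    """Multiply two successive improvement factors.

    Assumes the DSP factor applies on top of the GPU factor,
    which is an interpretation and not a stated result.
    """
    return gpu_gain * dsp_extra_gain


# GPU range quoted in the text (4x to 5x), with the DSP's "up to" 2x on top.
for gpu_gain in (4.0, 5.0):
    print(f"GPU {gpu_gain:.0f}x -> DSP up to {combined_gain(gpu_gain, 2.0):.0f}x")
```

Under those assumptions the DSP with HVX would land at up to roughly 8-10x, but your own models are the only benchmark that counts.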
Developers are already using NPE to harness the power of Snapdragon for their mobile app experiences. Facebook, for example, announced at its developer conference, F8, that it would integrate the NPE into the camera of the Facebook app to accelerate Caffe2-powered AR features. The goal is more fluid, seamless and realistic AR in photos and live videos.
The NPE SDK includes runtime software, libraries, APIs, offline model conversion tools, sample code, documentation, and debugging and benchmarking tools. The NPE SDK is currently compatible with Snapdragon 820, 835, 625, 626, 650, 652, 653 and 660 (Android) and Snapdragon 625 and 626 (Linux).
If you have the resources and know-how to train neural network models, and if you want optimal performance and power on mobile or edge-based devices powered by Snapdragon, then the NPE SDK is for you. You don’t need to know about heterogeneous computing or be an expert in ML, although it helps to have some experience in data, analytics and the training of deep neural networks.
Look for future posts from me and the Qualcomm AI team with more details on the NPE SDK:
- well-known applications that already use it
- ways you can get started with it
- specifics on improvement in performance and energy efficiency
Meanwhile, visit our Snapdragon Neural Processing Engine page to download the SDK today.
You can also register for our webinar, Snapdragon and AI at the edge, on August 1, 2017 at 9 AM PST, to learn more about why QTI believes that mobile is the perfect platform to drive AI experiences on device.