AI hardware cores/accelerators
Qualcomm® Innovators Development Kit supports running AI/ML models on the following three hardware cores/accelerators:
- Qualcomm® Hexagon™ Tensor Processor (HTP) - The Qualcomm HTP is an AI accelerator suited for running computationally intensive AI workloads. To run an AI/ML model on the HTP with improved performance, the model must be quantized to one of the supported precisions: INT4, INT8, INT16, or FP16.
- Qualcomm® Adreno™ GPU - The GPU can be used to run unquantized FP32/FP16 models with higher throughput than the CPU. The GPU can also run UDOs (user-defined operations) implemented using OpenCL.
- Qualcomm® Kryo™ CPU - The CPU supports unquantized models with FP32 precision. The CPU can be used to run UDOs or ops that are not optimized for execution on the HTP, and can also be used for model benchmarking.

The table below summarizes the properties of the hardware cores/accelerators available on Snapdragon for executing AI/ML models:
| Accelerator | Supported Data Types | Quantization | Power | Throughput |
|---|---|---|---|---|
| HTP | INT4, INT8, INT16, FP16 | Needed | Low | High |
| GPU | INT8, INT16, FP16, FP32 | Not needed | Medium | Medium |
| CPU | INT8, INT16, FP16, FP32 | Not needed | High | Low |
| Datatype | Details |
|---|---|
| INT4 | 4-bit weights + 8-bit activations |
| INT8 | 8-bit weights + 8-bit activations |
| INT16 | 8-bit weights + 16-bit activations |
| FP16 | 16-bit floating-point precision |
| FP32 | 32-bit floating-point precision |
Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries.