In contrast to the Qualcomm Neural Processing SDK (which accelerates a DLC model converted from TensorFlow, Caffe, Caffe2, or ONNX), QRB5165 supports accelerating TFLite models on Hexagon DSPs, the GPU, and the CPU via NNAPI. Although Google's NNAPI is specific to Android, the NN framework (along with NNHAL 1.2) has been ported from Android to run on QRB5165, with support extended to run models on Hexagon DSPs.

The workflow is similar to that on Android: a model is trained in TensorFlow to meet the criteria, then frozen and converted to TFLite (using the TFLite converter). The model is then provided to the NNAPI runtime, which can now offload models to the Hexagon DSP. The GStreamer TFLite plugin can be used to exercise TFLite use cases; the postprocessing result is attached to the GStreamer buffer as machine learning metadata (MLMeta). Alternatively, an application can use NNAPI directly to accelerate the model. TFLite and NNAPI code samples will be provided on GitHub (coming soon).
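The conversion step on the host can be sketched as a small shell helper. This is a minimal sketch, not the method used by the QRB5165 tooling: it assumes the tflite_convert command-line tool that ships with TensorFlow 1.x is installed on the host, convert_frozen_graph is a hypothetical helper name, and the input and output tensor names are model-specific values you must supply. On TensorFlow 2.x, the frozen-graph flags are only accepted together with --enable_v1_converter.

```shell
#!/bin/sh
# Sketch: convert a frozen TensorFlow 1.x graph (.pb) to a .tflite file.
# convert_frozen_graph is a hypothetical helper, not part of any SDK.
# Arguments: GRAPH.pb OUT.tflite INPUT_ARRAYS OUTPUT_ARRAYS
convert_frozen_graph() {
    if [ "$#" -ne 4 ]; then
        echo "usage: convert_frozen_graph GRAPH.pb OUT.tflite INPUT_ARRAYS OUTPUT_ARRAYS" >&2
        return 2
    fi
    graph="$1"
    output="$2"
    inputs="$3"    # comma-separated input tensor names (model-specific)
    outputs="$4"   # comma-separated output tensor names (model-specific)

    # Assumes the TF 1.x tflite_convert CLI is on PATH.
    tflite_convert \
        --graph_def_file="$graph" \
        --output_file="$output" \
        --input_arrays="$inputs" \
        --output_arrays="$outputs"
}
```

A call would look like convert_frozen_graph frozen.pb detect.tflite INPUT_NAME OUTPUT_NAME, where the file names and tensor names are placeholders for your own model.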

This process is illustrated in the following diagram.

In the diagram, upon GStreamer launch, frames for inference from either a camera source (YUV) or a file source are delivered to the GStreamer TF sink (the TFLite GStreamer plugin). The TFLite runtime can run on the DSP, CPU, or GPU. The model(s) are prepared for TF on a separate host. Inference results are gathered back in the GStreamer TF sink for postprocessing (for example, overlaying bounding boxes and class IDs on detected objects in the frame if the model is an object detection model).

The following example runs inference on a live camera stream (1280x720 at 30 fps in the command below), with a bounding-box overlay applied inline on the YUV stream. The output is rendered on the Weston display, which is why XDG_RUNTIME_DIR is set. Before running, push the corresponding labels, model, and config files to the paths given in the labels, model, and config properties.

export XDG_RUNTIME_DIR=/usr/bin/weston_socket && gst-launch-1.0 qtiqmmfsrc ! video/x-raw, format=NV12, width=1280, height=720, framerate=30/1, camera=0 ! qtimletflite config=/data/misc/camera/mle_tflite.config model=/data/misc/camera/detect.tflite labels=/data/misc/camera/labelmap.txt postprocessing=detection ! queue ! qtioverlay ! waylandsink x=960 y=0 width=960 height=540 async=true sync=false enable-last-sample=false
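Because the pipeline fails if any of the pushed files is missing, it can help to check the inputs before launching. The following is a minimal sketch under stated assumptions: check_inputs, build_pipeline, and run_pipeline are hypothetical helper names, the default paths simply mirror the command above, and the paths are assumed to contain no spaces.

```shell
#!/bin/sh
# Sketch: verify the pushed files, then launch the same pipeline as above.
# All helper names here are hypothetical; paths default to those in the command.
MLE_DIR="${MLE_DIR:-/data/misc/camera}"
CONFIG="${CONFIG:-$MLE_DIR/mle_tflite.config}"
MODEL="${MODEL:-$MLE_DIR/detect.tflite}"
LABELS="${LABELS:-$MLE_DIR/labelmap.txt}"

# Return non-zero and name the first missing input file.
check_inputs() {
    for f in "$CONFIG" "$MODEL" "$LABELS"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
    done
    return 0
}

# Print the pipeline description, with the file paths substituted in.
build_pipeline() {
    printf '%s' "qtiqmmfsrc ! video/x-raw, format=NV12, width=1280, height=720, framerate=30/1, camera=0 ! qtimletflite config=$CONFIG model=$MODEL labels=$LABELS postprocessing=detection ! queue ! qtioverlay ! waylandsink x=960 y=0 width=960 height=540 async=true sync=false enable-last-sample=false"
}

# Check the inputs, set the Weston runtime directory, and launch.
run_pipeline() {
    check_inputs || return 1
    export XDG_RUNTIME_DIR=/usr/bin/weston_socket
    # Unquoted on purpose: the description must be split into separate arguments.
    gst-launch-1.0 $(build_pipeline)
}
```

Calling run_pipeline on the device then either reports the first missing file or starts the pipeline; setting MODEL, LABELS, or CONFIG beforehand selects different files.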