Forums - How to run the NMS function on the GPU to speed up time on the RB5 dev kit, after inference using the model_quant.dlc file

How to run the NMS function on the GPU to speed up time on the RB5 dev kit, after inference using the model_quant.dlc file
nguy3nt4n99
Join Date: 14 Mar 24
Posts: 2
Posted: Sat, 2024-04-27 22:45

Hi everybody,

I was able to run yolov8n.dlc based on the code at https://github.com/quic/sample-apps-for-robotics-platforms/tree/master/R...
However, after inference completes, I have to run the NMS function on the CPU, which takes a lot of time. Is there a way to run this NMS function on the GPU?
I translated the original NMS function to C++ using the torch-cpu library.


Thanks,
Tan

jesustotten735
Join Date: 27 Jun 24
Posts: 1
Posted: Thu, 2024-06-27 21:46

Quote:
Hi everybody,

I was able to run yolov8n.dlc based on the code at https://github.com/quic/sample-apps-for-robotics-platforms/tree/master/R...
However, after inference completes, I have to run the NMS function on the CPU, which takes a lot of time. Is there a way to run this NMS function on the GPU?
I translated the original NMS function to C++ using the torch-cpu library.
YOLOv8's original NMS function is at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/utils/o...

Thanks,
Tan

I think you should modify your code to use the GPU version of the NMS function. You can use torchvision.ops.nms, which dispatches to an optimized CUDA kernel when its input tensors are on the GPU. First, import the necessary packages:

python
import torch
from torchvision.ops import nms
 
Then, replace the relevant part of your code where you perform NMS with the following lines:
python
# `boxes` is an (N, 4) tensor in (x1, y1, x2, y2) format;
# `scores` is the matching (N,) tensor of confidence scores
keep = nms(boxes, scores, iou_threshold)  # indices of boxes that survive NMS
boxes = boxes[keep]
scores = scores[keep]
 
Note: Make sure that the boxes and scores tensors are already on the GPU. You can move them to the GPU using the .to(device) method, where device is the CUDA device you want to use.

Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries (“Qualcomm”). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.