SNPE 2.16.0 on sm8650 yolov8 failed

SNPE 2.16.0 on sm8650 yolov8 failed
lhwywh
Join Date: 3 Dec 23
Posts: 4
Posted: Tue, 2023-12-12 20:53

Hi there. I tried to run YOLOv8m with SNPE 2.16 on a Xiaomi 14 (SM8650 / 8 Gen 3), but it failed. You can get my model at https://github.com/lsCoding666/SNPE-YOLOV8-MODEL/blob/main/model/yolov8m_htp.dlc

 

Code and screenshots are at https://github.com/lsCoding666/SNPE-YOLOV8-MODEL/tree/main

 

ONNX version: 1.15

SNPE version: 2.16.0

The model is quantized. The quantized model works when run on the GPU, but fails when run on the DSP.

 

Run on DSP (see line 28 in the screenshot):

https://github.com/lsCoding666/SNPE-YOLOV8-MODEL/blob/main/run_on_dsp.png

 

Run on GPU:

https://github.com/lsCoding666/SNPE-YOLOV8-MODEL/blob/main/run_on_gpu.png

 

And here is the code:

https://github.com/lsCoding666/SNPE-YOLOV8-MODEL/blob/main/Person_Detect...

```
cv::Mat Person_Detect::ProcessImgYoloV8(cv::Mat mat, char *pJstring) {
    img_mat = mat;

    // Letterbox resize to the 640x640 network input, preserving aspect ratio
    std::vector<Detection> output;
    cv::Mat res_img = cv::Mat(640, 640, CV_8UC3);

    cv::Mat input_mat;
    im_scale = std::min((float) INPUT_WIDTH / img_mat.cols, (float) INPUT_HEIGHT / img_mat.rows);

    int new_w = int(img_mat.cols * im_scale);
    int new_h = int(img_mat.rows * im_scale);
    cv::resize(img_mat, input_mat, cv::Size(new_w, new_h));

    int p_w = INPUT_WIDTH - new_w;
    int p_h = INPUT_HEIGHT - new_h;   // was INPUT_WIDTH - new_h; only harmless while the input is square

    int top = p_h / 2;
    int bottom = p_h - top;
    int left = p_w / 2;
    int right = p_w - left;

    // Pad with the conventional YOLO gray value (114)
    cv::copyMakeBorder(input_mat, input_mat,
                       top, bottom,
                       left, right,
                       cv::BORDER_CONSTANT,
                       cv::Scalar(114, 114, 114));

    // Run inference
    zdl::DlSystem::TensorMap output_tensor_map = qc->predict(input_mat);
    zdl::DlSystem::StringList out_tensors = output_tensor_map.getTensorNames();

    // Copy each output into a std::map of float vectors; this is easy to inspect while debugging
    std::map<std::string, std::vector<float>> out_itensor_map;
    for (size_t i = 0; i < out_tensors.size(); i++) {
        zdl::DlSystem::ITensor *out_itensor = output_tensor_map.getTensor(out_tensors.at(i));
        std::vector<float> out_vec{reinterpret_cast<float *>(&(*out_itensor->begin())),
                                   reinterpret_cast<float *>(&(*out_itensor->end()))};
        out_itensor_map.insert(std::make_pair(std::string(out_tensors.at(i)), out_vec));
    }

    // Decode the raw output into boxes
    std::vector<BoxInfo> result;
    zdl::DlSystem::ITensor *out_itensor = output_tensor_map.getTensor(out_tensors.at(0));
    auto boxes = Person_Detect::decode_inferV8(out_itensor->begin().dataPointer(),
                                               {(int) img_mat.cols, (int) img_mat.rows},
                                               left, top,
                                               class_list.size(),
                                               CONFIDENCE_THRESHOLD);
    result.insert(result.begin(), boxes.begin(), boxes.end());

    // Non-maximum suppression
    Person_Detect::nms(result, NMS_THRESHOLD);

    // Draw labels and rectangles
    for (size_t i = 0; i < result.size(); ++i) {
        auto detection = result[i];
        __android_log_print(ANDROID_LOG_INFO, LOG_TAG, "label %d", detection.label);
        __android_log_print(ANDROID_LOG_INFO, LOG_TAG, "score %f", detection.score);
        cv::Scalar color = cv::Scalar(255, 255, 0);
        cv::rectangle(img_mat, cv::Point(detection.x1, detection.y1),
                      cv::Point(detection.x2, detection.y2),
                      color, 2);
        cv::rectangle(img_mat, cv::Point(detection.x1, detection.y1 - 20),
                      cv::Point(detection.x2, detection.y1),
                      color, -1);
        std::stringstream ss;
        ss << class_list[detection.label] << " " << detection.score;
        cv::putText(img_mat, ss.str(), cv::Point(detection.x1, detection.y1),
                    cv::FONT_HERSHEY_COMPLEX, 0.8,
                    cv::Scalar(0, 0, 0), 2);
    }

    // Save the annotated image for inspection
    std::string str1 = "/storage/emulated/0/testresult/";
    std::string str2 = ".jpg";
    cvtColor(img_mat, img_mat, cv::COLOR_RGB2BGR);
    cv::imwrite(str1.append(pJstring).append(str2), img_mat);
    pred_out.clear();
    return img_mat;
}
```

 

Here is decode_inferV8:

```
std::vector<BoxInfo>
Person_Detect::decode_inferV8(float *dataSource, const YoloSize &frame_size,
                              int left, int top,
                              int num_classes, float threshold) {
    // Layout assumed here: 8400 candidates, each 84 floats
    // (cx, cy, w, h followed by num_classes class scores)
    float *data = dataSource;
    std::vector<BoxInfo> result;
    for (int i = 0; i < 8400; ++i) {
        // Find the highest-scoring class for this candidate
        float maxScore = 0;
        int maxClass = -1;
        for (int cls = 0; cls < num_classes; cls++) {
            float score = data[cls + 4];
            if (score > maxScore) {
                maxScore = score;
                maxClass = cls;
            }
        }
        if (maxScore > threshold) {
            BoxInfo box;
            float w = data[2];
            float h = data[3];

            // Undo the letterbox padding and scaling, clamping to the frame
            box.x1 = std::max(0, std::min(frame_size.width,
                                          int((data[0] - w / 2.f - left) / im_scale)));
            box.y1 = std::max(0, std::min(frame_size.height,
                                          int((data[1] - h / 2.f - top) / im_scale)));
            box.x2 = std::max(0, std::min(frame_size.width,
                                          int((data[0] + w / 2.f - left) / im_scale)));
            box.y2 = std::max(0, std::min(frame_size.height,
                                          int((data[1] + h / 2.f - top) / im_scale)));
            box.score = maxScore;
            box.label = maxClass;
            result.push_back(box);
        }
        data += 84;
    }
    return result;
}
```

decode_inferV8 and ProcessImgYoloV8 run correctly on the GPU with YOLOv8, so these two functions should not be the source of the bug.

 

I also tried running YOLOv5 on the Xiaomi 14. It works well on both GPU and DSP.

 

The model is the same but the results are different. It is very strange, so I would appreciate your help. Thanks.

lhwywh
Posted: Tue, 2023-12-12 21:58

By the way, I added the GPU and DSP raw float results to https://github.com/lsCoding666/SNPE-YOLOV8-MODEL/
If you compare the files, you can see the detected person positions are very close, but the DSP scores are all 0.
https://github.com/lsCoding666/SNPE-YOLOV8-MODEL/blob/main/dsp_gpu_resul...
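The comparison described above can be sketched with two small helpers. This is a minimal, self-contained example (not the repo's actual tooling); in practice you would fill the vectors by reading the two .raw dumps with `std::ifstream` in binary mode:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Maximum element-wise absolute difference between two float buffers,
// e.g. the GPU and DSP output dumps of the same tensor.
static float maxAbsDiff(const std::vector<float> &a, const std::vector<float> &b) {
    float max_diff = 0.f;
    size_t n = std::min(a.size(), b.size());
    for (size_t i = 0; i < n; ++i)
        max_diff = std::max(max_diff, std::fabs(a[i] - b[i]));
    return max_diff;
}

// Count exact zeros -- a score region that is entirely zero is the
// symptom seen on the DSP here.
static size_t countZeros(const std::vector<float> &v) {
    size_t zeros = 0;
    for (float x : v)
        if (x == 0.f) ++zeros;
    return zeros;
}
```

Running `countZeros` over just the score slice of each dump makes the "all scores are 0 on DSP" symptom obvious at a glance.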

lhwywh
Posted: Tue, 2023-12-12 23:45

Bug fixed. I found the problem is the last step of the model: the concat.
Concatenating the locations and scores into one result tensor is buggy on the DSP and makes the scores all 0.
The solution is to delete the concat, so the model has 2 outputs: one for the box positions and one for the scores.
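With the concat removed, decoding reads two separate buffers instead of one 84-wide row. Here is a minimal sketch of that; the function name and memory layouts are assumptions (boxes as `num_candidates * 4` floats cx, cy, w, h; scores as `num_candidates * num_classes` floats) and must be adjusted to the actual DLC output shapes:

```cpp
#include <vector>

struct Box { float x1, y1, x2, y2, score; int label; };

// Decode detections from two separate model outputs (the post-fix model):
// one tensor of box geometry, one tensor of class scores.
static std::vector<Box> decodeTwoOutputs(const float *boxes, const float *scores,
                                         int num_candidates, int num_classes,
                                         float threshold) {
    std::vector<Box> result;
    for (int i = 0; i < num_candidates; ++i) {
        // Best class for this candidate, from the score tensor
        const float *s = scores + i * num_classes;
        float max_score = 0.f;
        int max_class = -1;
        for (int c = 0; c < num_classes; ++c) {
            if (s[c] > max_score) { max_score = s[c]; max_class = c; }
        }
        if (max_score > threshold) {
            // Box geometry from the position tensor: cx, cy, w, h
            const float *b = boxes + i * 4;
            Box box;
            box.x1 = b[0] - b[2] / 2.f;
            box.y1 = b[1] - b[3] / 2.f;
            box.x2 = b[0] + b[2] / 2.f;
            box.y2 = b[1] + b[3] / 2.f;
            box.score = max_score;
            box.label = max_class;
            result.push_back(box);
        }
    }
    return result;
}
```

The letterbox un-padding and clamping from the original decode_inferV8 would still be applied on top of this; it is omitted here to keep the two-output structure clear.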
