Image Segmentation using DeepLabV3+

Skill Level: Intermediate
Area of Focus: Computer Vision
Operating System: Linux
Software Tools: Neural Processing SDK for AI

This project uses the Qualcomm® Neural Processing SDK, which lets you tune the performance of AI applications running on Snapdragon® mobile platforms. The SDK converts trained models from Caffe, Caffe2, ONNX, and TensorFlow to the Snapdragon-supported DLC format (.dlc). We then use the converted model to perform semantic segmentation, relying on the DeepLab V3 support in the SDK.

The main objective of this project is to develop a machine learning application that performs selective background manipulation on an image, according to the user's needs, using an architecture such as DeepLabV3. We will build this application on Linux (Ubuntu) using the Qualcomm Neural Processing SDK.

Requirements

Software

  1. Ubuntu 16.04 machine
  2. Qualcomm Neural Processing SDK setup
  3. Python 3.5

Hardware

  1. Intel Core i5 or greater
  2. Minimum 16 GB of system RAM
  3. NVIDIA GTX graphics card, GTX 1050 Ti or better (https://www.nvidia.com/en-in/geforce/products/10series/geforce-gtx-1050/)

What do we aim for?

To achieve background manipulation (changing the background color) on an image with the help of semantic segmentation using Google’s DeepLabV3 architecture.

The machine learning model used for segmentation is the pre-trained DeepLabV3 checkpoint built on the MobileNetV2 network architecture (deeplabv3_mnv2_pascal_train_aug, downloaded later in this document).

Why DeepLabV3?

DeepLabV3 applies an atrous spatial pyramid pooling (ASPP) operation at the end of the encoder. ASPP runs atrous (dilated) convolutions at several rates in parallel, so the network captures context at multiple scales, which improves the quality of the semantic segmentation prediction.

DeepLabV3 is also used in Google Pixel devices to implement the portrait mode of their camera. This document also discusses how to train a DeepLabV3 model and use it with the Qualcomm Neural Processing SDK for AI.
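
To make the idea of atrous convolution at several rates more concrete, here is a simplified NumPy sketch. It is not the DeepLabV3 implementation: it works on a single channel, reuses one fixed averaging kernel for every branch, and the rates (1, 2, 4) are small illustrative values chosen for a 7x7 input. The function names are hypothetical.

```python
import numpy as np

def atrous_conv2d(image, kernel, rate):
    """Zero-padded, same-size 2D cross-correlation with dilation `rate`."""
    kh, kw = kernel.shape
    pad_h = rate * (kh // 2)
    pad_w = rate * (kw // 2)
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)), mode="constant")
    out = np.zeros(image.shape, dtype=np.float32)
    for i in range(kh):
        for j in range(kw):
            # Neighbouring kernel taps are `rate` pixels apart in the input
            out += kernel[i, j] * padded[i * rate:i * rate + image.shape[0],
                                         j * rate:j * rate + image.shape[1]]
    return out

def simple_aspp(image, kernel, rates=(1, 2, 4)):
    """Stack the responses obtained at several dilation rates."""
    return np.stack([atrous_conv2d(image, kernel, r) for r in rates], axis=-1)

image = np.arange(49, dtype=np.float32).reshape(7, 7)
kernel = np.ones((3, 3), dtype=np.float32) / 9.0
features = simple_aspp(image, kernel)
print(features.shape)
```

Each slice along the last axis sees the same pixel with a different receptive field, which is the multi-scale context that ASPP provides to the segmentation head.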

How to train the model?

Data collection and pre-processing are the main aspects of machine learning. Setting up the TensorFlow DeepLab API is a prerequisite for training DeepLabV3.

For installation steps, follow the instructions at https://github.com/tensorflow/models/tree/master/research/deeplab

After the installation is complete, execute the commands below to start the training:

  $ cd <path to tensorflow setup>/tensorflow/models/research/deeplab
  $ sh local_test.sh

Note: By default, the iteration count in local_test.sh is 10; modify it if required.

How to convert the model into DLC?

Here we use a pre-trained DeepLab model and convert it to DLC format using the Qualcomm Neural Processing SDK. To use a custom-trained DeepLab model, follow the instructions in the "How to train the model?" section. Before proceeding, make sure the Qualcomm Neural Processing SDK is set up on the system. If it is not, set it up by following the instructions at the link below.

https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk/getting-started

1. Download & Extract Model

Execute the commands below to download and extract the pre-trained DeepLab model:

  $ wget http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_train_aug_2018_01_29.tar.gz
  $ tar -xzvf deeplabv3_mnv2_pascal_train_aug_2018_01_29.tar.gz

2. Converting the model to DLC format:

The model is pre-trained using the TensorFlow framework and exported to a graph file with the .pb extension. The snpe-tensorflow-to-dlc tool from the Qualcomm Neural Processing SDK is used to convert the model to DLC format.

Below are the input arguments for the snpe-tensorflow-to-dlc tool (the values shown are for DeepLabV3):

  • Input Layer name: sub_7
  • Input Shape: 1, 513, 513, 3 (If using Xception network architecture)
  • Output Layer name: ArgMax

Run the command below to convert the model to DLC format; it generates the deeplabv3.dlc file:

  $ snpe-tensorflow-to-dlc --graph deeplabv3_mnv2_pascal_train_aug/frozen_inference_graph.pb -i sub_7 1,513,513,3 --out_node ArgMax --dlc deeplabv3.dlc --allow_unconsumed_nodes
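
If the conversion needs to be scripted, the command above can be assembled and launched from Python. The following is a minimal sketch and not part of the original project: the helper name build_convert_command is hypothetical, and it uses only the flags shown in the command above. The tool is launched only when it is found on the PATH.

```python
import shutil
import subprocess

def build_convert_command(graph_path, dlc_path,
                          input_layer="sub_7", input_shape=(1, 513, 513, 3),
                          output_layer="ArgMax"):
    """Assemble the snpe-tensorflow-to-dlc argument list (hypothetical helper)."""
    shape = ",".join(str(dim) for dim in input_shape)
    return [
        "snpe-tensorflow-to-dlc",
        "--graph", graph_path,
        "-i", input_layer, shape,
        "--out_node", output_layer,
        "--dlc", dlc_path,
        "--allow_unconsumed_nodes",
    ]

command = build_convert_command(
    "deeplabv3_mnv2_pascal_train_aug/frozen_inference_graph.pb", "deeplabv3.dlc")
print(" ".join(command))

# Run the converter only if the SDK tools are available on this machine
if shutil.which(command[0]) is not None:
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems if the graph path contains spaces.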

Inferencing on Ubuntu using the Qualcomm Neural Processing SDK

The Qualcomm Neural Processing SDK does not accept image files directly as model input for inferencing. It requires the input as a NumPy array stored in raw form on secondary storage, so some basic image pre-processing is needed before the input can be passed to the SDK. This section describes the pre-processing of the input images using OpenCV.

  1. Resize the image to a shape of 513x513x3
  2. Pad the smaller dimension with the mean value of 128 so that the image is 513x513x3
  3. Convert the image to type float32
  4. Multiply the image element-wise by 0.00784313771874 and subtract 1.0
  5. Store the pre-processed array as a raw file

Here is the code that implements the above pre-processing steps for the input image:

    import numpy as np
    import cv2

    frame = cv2.imread('image.jpg')
    # Resize the frame to the required input size
    frame_resized = cv2.resize(frame, (513, 513))
    # Subtract the mean (127.5), multiply by 0.007843 and swap BGR to RGB
    blob = cv2.dnn.blobFromImage(frame_resized, 0.007843, (513, 513), (127.5, 127.5, 127.5), swapRB=True)
    # blobFromImage returns NCHW (1, 3, 513, 513); the model expects NHWC (1, 513, 513, 3),
    # so the axes must be transposed (a plain reshape would scramble the pixels)
    blob = np.ascontiguousarray(np.transpose(blob, (0, 2, 3, 1)), dtype=np.float32)
    # Store the array as raw float32 bytes
    blob.tofile('blob.raw')
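
The script above resizes the image straight to 513x513, while the steps listed earlier also mention padding with a value of 128. A minimal NumPy-only sketch of that padding variant is given below. The function name and the choice to place the image in the top-left corner are assumptions made for illustration, and the input is assumed to already fit within 513x513 and to be in the channel order the model expects.

```python
import numpy as np

SCALE = 0.00784313771874  # approximately 2 / 255

def pad_and_normalize(image, size=513, pad_value=128.0):
    """Pad an HxWx3 image to size x size, then map [0, 255] to about [-1, 1]."""
    height, width = image.shape[:2]
    if height > size or width > size:
        raise ValueError("image must fit within %dx%d; resize it first" % (size, size))
    # Canvas filled with the padding value, image copied into the top-left corner
    canvas = np.full((size, size, 3), pad_value, dtype=np.float32)
    canvas[:height, :width, :] = image
    normalized = canvas * SCALE - 1.0
    # Add the batch dimension: (1, size, size, 3)
    return normalized[np.newaxis, ...].astype(np.float32)

# Synthetic black image standing in for a real photo
demo = np.zeros((300, 400, 3), dtype=np.uint8)
tensor = pad_and_normalize(demo)
print(tensor.shape, tensor.dtype)
# tensor.tofile('blob.raw') would store it in the raw format described above
```

With this variant the padded region maps to a value close to zero, so it sits near the middle of the normalized input range.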
  

On executing the above script, the blob.raw file is generated.
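
The inference step itself is not shown in this document. As a rough sketch, assuming the SDK's snpe-net-run tool is used with its --container and --input_list options, the raw file can be listed in a text file and passed to the tool as follows. The file name input_list.txt and the helper name are illustrative choices, and the tool is launched only when it is found on the PATH and blob.raw exists.

```python
import os
import shutil
import subprocess

def prepare_inference(raw_path="blob.raw", dlc_path="deeplabv3.dlc",
                      list_path="input_list.txt"):
    """Write the input list file and return the snpe-net-run argument list."""
    # The input list holds one raw input file per line
    with open(list_path, "w") as list_file:
        list_file.write(os.path.abspath(raw_path) + "\n")
    return ["snpe-net-run", "--container", dlc_path, "--input_list", list_path]

command = prepare_inference()
print(" ".join(command))

# Run inference only if the SDK tools and the input file are available
if shutil.which(command[0]) is not None and os.path.exists("blob.raw"):
    subprocess.run(command, check=True)
```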

Procedure to change the background using DeepLab

After inference is run on the pre-processed input (for example with the SDK's snpe-net-run tool), the generated ArgMax:0.raw file is stored inside the output/Result_0 directory. Below is a detailed description of how to change the background using this output.

  1. The output of the DeepLabV3 model is a 513x513x1 NumPy array.
  2. Read the output file as float32.
  3. Each element holds the predicted class number of the corresponding pixel.
  4. Resize the predicted array to the size of the original image.
  5. Replace the background of the original image according to the resized prediction.

The script below changes the background to grayscale for a pre-processed image:
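
To inspect which classes the model predicted, the class numbers can be mapped to names. The sketch below assumes the model was trained on PASCAL VOC 2012, as the deeplabv3_mnv2_pascal_train_aug checkpoint name suggests, and uses a synthetic array in place of ArgMax:0.raw; the helper name is hypothetical.

```python
import numpy as np

# PASCAL VOC 2012 class names, indexed by the class number in the model output
VOC_CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
    "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

def summarize_classes(class_map):
    """Return {class name: pixel count} for every class present in the map."""
    ids, counts = np.unique(class_map.astype(np.int64), return_counts=True)
    return {VOC_CLASSES[i]: int(c) for i, c in zip(ids, counts)}

# Synthetic 513x513 output standing in for ArgMax:0.raw
demo = np.zeros((513, 513), dtype=np.float32)
demo[100:300, 200:350] = 15.0  # a rectangular "person" region
summary = summarize_classes(demo)
print(summary)
```

Class 15 is "person", which is why the background-change script treats pixels with that value as the foreground.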

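  import cv2
  import numpy as np

  PERSON_CLASS = 15  # PASCAL VOC class number for "person"

  # Read the model output: one float32 class number per pixel
  arr = np.fromfile('ArgMax:0.raw', dtype=np.float32)
  arr = np.reshape(arr, (513, 513))
  # 255 where a person was predicted, 0 everywhere else
  mask = np.where(arr == PERSON_CLASS, 255, 0).astype(np.uint8)

  original_img = cv2.imread('image.jpg')
  height, width = original_img.shape[:2]
  # Resize the mask to the original image size (nearest neighbour keeps it binary)
  mask_full = cv2.resize(mask, (width, height), interpolation=cv2.INTER_NEAREST)

  # Grayscale copy of the image, expanded back to three channels
  gray = cv2.cvtColor(original_img, cv2.COLOR_BGR2GRAY)
  gray_bgr = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)
  # Keep the original colors on the person and use grayscale for the background
  background = mask_full != 255
  original_img[background] = gray_bgr[background]

  cv2.imwrite('changed_bg_img.jpg', original_img)
  cv2.imwrite('actual_out.jpg', mask)
  cv2.imshow('output1', original_img)
  cv2.imshow('output', mask)
  cv2.waitKey(0)

Below are the images before and after changing the background, respectively: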
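
The same mask can be used to change the background to a solid color instead of grayscale. The following NumPy-only sketch uses small synthetic arrays; the function name is hypothetical, the color is given in BGR order as OpenCV uses, and the class map is assumed to have already been resized to the image size.

```python
import numpy as np

def replace_background(image, class_map, color=(255, 0, 0), keep_class=15):
    """Paint every pixel not labelled `keep_class` with `color` (BGR order)."""
    if image.shape[:2] != class_map.shape[:2]:
        raise ValueError("image and class_map must have the same height and width")
    result = image.copy()
    background = class_map != keep_class
    # Boolean-mask assignment broadcasts the color over all background pixels
    result[background] = color
    return result

# Synthetic 4x4 gray image with a 2x2 "person" region in the middle
image = np.full((4, 4, 3), 200, dtype=np.uint8)
class_map = np.zeros((4, 4), dtype=np.float32)
class_map[1:3, 1:3] = 15
out = replace_background(image, class_map)
print(out[0, 0], out[1, 1])
```

Working on a copy leaves the input image untouched, so several background colors can be tried from the same source.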

Original Image with Color Background.

Altered Image with Black and White Background.

Name: Title/Company
Rakesh Sankar: Sr. System Architect, GlobalEdge Software, Ltd
Shivanand Pujar: Project Manager, GlobalEdge Software, Ltd
Akshay Kulkarni: Technical Lead, GlobalEdge Software, Ltd
Sushant Ahuja: Sr. Software Engineer, GlobalEdge Software, Ltd
Jinka Venkata Saikiran: Sr. Software Engineer, GlobalEdge Software, Ltd
Sahil Munaf Bandar: Software Engineer, GlobalEdge Software, Ltd

Snapdragon and Qualcomm Neural Processing SDK are products of Qualcomm Technologies, Inc. and/or its subsidiaries.