DeepLabV3+ Android App

Skill Level: Intermediate
Area of Focus: Artificial Intelligence, Computer Vision
Operating System: Android
Software Tools: Neural Processing SDK for AI

The project is designed around the Qualcomm® Neural Processing SDK, which converts trained models from Caffe, Caffe2, ONNX, and TensorFlow to the Snapdragon-supported format (.dlc). We then use the converted model in an application that performs semantic segmentation with DeepLabV3+.
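
For reference, a frozen TensorFlow graph is converted to DLC with the SDK's snpe-tensorflow-to-dlc tool. The command below is an illustrative sketch only: flag syntax differs between SDK releases, and the input/output node names (sub_7, ArgMax) are assumptions for the public DeepLabV3+ MobileNetV2 checkpoint, so check the documentation for your SDK version.

  snpe-tensorflow-to-dlc --graph frozen_inference_graph.pb \
      --input_dim sub_7 1,513,513,3 \
      --out_node ArgMax \
      --dlc deeplabv3.dlc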

The main objective is to create a machine learning application that performs selective background manipulation on an image according to the user's needs, using architectures such as DeepLabV3.

Requirements

Below are the items used in this project.

  1. Mobile display with the QC Image Segmentation app
  2. Snapdragon Mobile Hardware Development Kit
  3. USB Type-C cable
  4. External camera setup
  5. Power cable

Deploying the project

  1. Download code from the GitHub Repository.
  2. Compile the code and run the application from Android Studio to generate the application (APK) file.

How does it work?

The Image Segmentation application opens a camera preview, captures a picture, and converts it to a bitmap. The network is built via the Neural Network Builder by passing deeplabv3.dlc as input. The bitmap is then given to the model for inference, which returns a FloatTensor output. The output is then post-processed to achieve background manipulation (changing the background to black and white) of the original input image.

Prerequisite for Camera Preview

Permission to obtain camera preview frames is granted in the following file:

https://github.com/globaledgesoft/deeplabv3-application-using-neural-processing-sdk/blob/master/AndroidApplication/app/src/main/AndroidManifest.xml

  <uses-permission android:name="android.permission.CAMERA" />

In order to use the Camera2 API, add the feature below:

  <uses-feature android:name="android.hardware.camera2" />

Loading Model

Code snippet for neural network connection and loading model:

  @Override
  protected NeuralNetwork doInBackground(File... params) {
      final SNPE.NeuralNetworkBuilder builder =
              new SNPE.NeuralNetworkBuilder(mApplicationContext)
              // Select a runtime order for the network.
              // Here we use the DSP and fall back, in order, to GPU then CPU,
              // depending on which runtimes are available.
              .setRuntimeOrder(DSP, GPU, CPU)
              // Load the model from the DLC file
              .setModel(new File("<model-path>"));
      // Build the network
      network = builder.build();
      return network;
  }
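
Once the network is built, an input tensor matching the model's input shape can be created once and reused across frames. A minimal sketch, assuming the input layer name is held in a field such as mInputLayerName (the mInputTensorReused and mInputTensorsMap fields reappear in the inference snippet later in this article):

  // Create a reusable float tensor shaped like the model's input layer (1,513,513,3)
  mInputTensorReused = network.createFloatTensor(
          network.getInputTensorsShapes().get(mInputLayerName));
  // Map the input layer name to the tensor; the same map is later passed to execute()
  mInputTensorsMap = new HashMap<>();
  mInputTensorsMap.put(mInputLayerName, mInputTensorReused);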

Capturing the Preview Using the Camera2 API

A TextureView is used to render the camera preview. TextureView.SurfaceTextureListener is an interface that notifies the application when the surface texture associated with the texture view is available.

  private final TextureView.SurfaceTextureListener mSurfaceTextureListener
          = new TextureView.SurfaceTextureListener() {
      @Override
      public void onSurfaceTextureAvailable(SurfaceTexture texture, int width, int height) {
          // Check the runtime camera permission before opening the camera
          if (ContextCompat.checkSelfPermission(getActivity(), Manifest.permission.CAMERA)
                  != PackageManager.PERMISSION_GRANTED) {
              requestCameraPermission();
              return;
          }

          CameraManager manager = (CameraManager) getActivity().getSystemService(Context.CAMERA_SERVICE);
          try {
              if (!mCameraOpenCloseLock.tryAcquire(2500, TimeUnit.MILLISECONDS)) {
                  throw new RuntimeException("Time out waiting to lock camera opening.");
              }
              // mCameraId = "1" (front camera), mCameraId = "0" (rear camera)
              manager.openCamera(mCameraId, mStateCallback, mBackgroundHandler);
          } catch (CameraAccessException e) {
              e.printStackTrace();
          } catch (InterruptedException e) {
              throw new RuntimeException("Interrupted while trying to lock camera opening.", e);
          }
      }

      // onSurfaceTextureSizeChanged, onSurfaceTextureDestroyed and
      // onSurfaceTextureUpdated must also be overridden (omitted here)
  };

Camera Callbacks

The camera callback, CameraDevice.StateCallback, is used to receive updates about the state of the camera device. In the overridden method below, a preview capture request targeting the surface texture is created so that frames can be obtained.

  @Override
  public void onOpened(@NonNull CameraDevice cameraDevice) {
      // This method is called when the camera is opened. We start the camera preview here.
      mCameraOpenCloseLock.release();
      mCameraDevice = cameraDevice;
      try {
          mPreviewRequestBuilder = mCameraDevice.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW);
          mPreviewRequestBuilder.addTarget(surface);
      } catch (CameraAccessException e) {
          e.printStackTrace();
      }
  }
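
The preview request built in onOpened takes effect only after a capture session is configured. A minimal sketch of that step, following the standard Camera2 pattern (mImageReader and mCaptureCallback refer to the snippets elsewhere in this section):

  mCameraDevice.createCaptureSession(
          Arrays.asList(surface, mImageReader.getSurface()),
          new CameraCaptureSession.StateCallback() {
              @Override
              public void onConfigured(@NonNull CameraCaptureSession session) {
                  try {
                      // Start the repeating preview request
                      session.setRepeatingRequest(mPreviewRequestBuilder.build(),
                              mCaptureCallback, mBackgroundHandler);
                  } catch (CameraAccessException e) {
                      e.printStackTrace();
                  }
              }

              @Override
              public void onConfigureFailed(@NonNull CameraCaptureSession session) {
                  // Preview configuration failed; nothing to do in this sketch
              }
          }, null);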

Getting image data from ImageReader

The ImageReader class allows direct application access to image data rendered into a Surface. The application uses this class to fetch the file path of the image created.

  private final ImageReader.OnImageAvailableListener mOnImageAvailableListener
          = new ImageReader.OnImageAvailableListener() {
      @Override
      public void onImageAvailable(ImageReader reader) {
          mBackgroundHandler.post(new ImageSaver(reader.acquireNextImage(), mFile));
      }
  };
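
The ImageSaver posted above is not shown in this article; a minimal sketch, based on the standard Camera2 sample, writes the JPEG bytes of the acquired image to mFile:

  private static class ImageSaver implements Runnable {
      private final Image mImage;
      private final File mFile;

      ImageSaver(Image image, File file) {
          mImage = image;
          mFile = file;
      }

      @Override
      public void run() {
          // Copy the JPEG bytes out of the image's single plane
          ByteBuffer buffer = mImage.getPlanes()[0].getBuffer();
          byte[] bytes = new byte[buffer.remaining()];
          buffer.get(bytes);
          try (FileOutputStream output = new FileOutputStream(mFile)) {
              output.write(bytes);
          } catch (IOException e) {
              e.printStackTrace();
          } finally {
              mImage.close();
          }
      }
  }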

Getting a Bitmap from the TextureView

A bitmap of fixed height and width can be obtained from the TextureView in the onCaptureCompleted callback using TotalCaptureResult. That bitmap can then be scaled and sent to the model as input.

  private CameraCaptureSession.CaptureCallback mCaptureCallback = new CameraCaptureSession.CaptureCallback() {
      @Override
      public void onCaptureCompleted(@NonNull CameraCaptureSession session,
                                     @NonNull CaptureRequest request,
                                     @NonNull TotalCaptureResult result) {
          // Get the file path of the image saved by the ImageReader
          String stringUri = mFile.getAbsolutePath();
          Bitmap mBitmap = BitmapFactory.decodeFile(stringUri);
      }
  };
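
Before inference, the decoded bitmap is scaled to the model's 513x513 input size, for example:

  // Scale the captured bitmap to the model's expected 513x513 input
  Bitmap mScaledBitmap = Bitmap.createScaledBitmap(mBitmap, 513, 513, true);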

Object Inference

The bitmap image is converted to an RGBA byte array, which is then pre-processed into a float tensor of size 513*513*3, the input shape required by the model. The prediction API requires a tensor of type Float and returns the object prediction as a Map<String, FloatTensor> object.

  private Map<String, FloatTensor> inferenceOnBitmap(Bitmap scaledBitmap) {
      final Map<String, FloatTensor> outputs;
      try {
          // safety check
          if (mNeuralnetwork == null || mInputTensorReused == null || scaledBitmap.getWidth() != getInputTensorWidth() || scaledBitmap.getHeight() != getInputTensorHeight()) {
              Logger.d("SNPEHelper", "No NN loaded, or image size different than tensor size");
              return null;
          }
          // [0.3ms] Bitmap to RGBA byte array (size: 513*513*4 (RGBA))
          mBitmapToFloatHelper.bitmapToBuffer(scaledBitmap);
          // [2ms] Pre-processing: Bitmap (513,513,4 ints) -> Float Input Tensor (513,513,3 floats)
          final float[] inputFloatsHW3 = mBitmapToFloatHelper.bufferToNormalFloatsBGR();
          if (mBitmapToFloatHelper.isFloatBufferBlack())
              return null;
          mInputTensorReused.write(inputFloatsHW3, 0, inputFloatsHW3.length, 0, 0);
          // [31ms on GPU16, 50ms on GPU] execute the inference
          outputs = mNeuralnetwork.execute(mInputTensorsMap);
      } catch (Exception e) {
          e.printStackTrace();
          Logger.d("SNPEHelper", e.getCause() + "");
          return null;
      }
      return outputs;
  }

Process pixels RGBA (0..255) to BGR (-1..1)

The scale factor 0.00784313771874 below is 2/255, so each channel value p is mapped to p * 2/255 - 1; 0 becomes -1 and 255 becomes +1.

  public float[] bufferToNormalFloatsBGR() {
      
      final byte[] inputArrayHW4 = mByteBufferHW4.array();
      final int area = mFloatBufferHW3.length / 3;
      long sumG = 0;
      int srcIdx = 0, dstIdx = 0;
      final float inputScale = 0.00784313771874f;
      for (int i = 0; i < area; i++) {
          // NOTE: '& 0xFF' acts as a cast to unsigned int (otherwise bright colors would come out as negative numbers)
          final int pixelR = inputArrayHW4[srcIdx] & 0xFF;
          final int pixelG = inputArrayHW4[srcIdx + 1] & 0xFF;
          final int pixelB = inputArrayHW4[srcIdx + 2] & 0xFF;
          mFloatBufferHW3[dstIdx] = inputScale * (float) pixelB - 1;
          mFloatBufferHW3[dstIdx + 1] = inputScale * (float) pixelG - 1;
          mFloatBufferHW3[dstIdx + 2] = inputScale * (float) pixelR - 1;
          srcIdx += 4;
          dstIdx += 3;
          sumG += pixelG;
      }
      // the buffer is considered black if, on average, Green < 13/255 (about 5%)
      mIsFloatBufferBlack = sumG < (area * 13);
      return mFloatBufferHW3;
  }

Image Segmentation

The output FloatTensor is further processed for background manipulation. The float matrix contains 0 for background pixels and 15 (the class index of the detected object) for object pixels. Based on this matrix, the output bitmap is created with background pixels converted to black and white while the object pixels are retained.

  MNETSSD_NUM_BOXES = mOutputs.get(MNETSSD_OUTPUT_LAYER).getSize();
  // Read the whole output tensor upfront into a float array
  mOutputs.get(MNETSSD_OUTPUT_LAYER).read(floatOutput, 0, MNETSSD_NUM_BOXES);
  // For the black/white background image
  int w = mScaledBitmap.getWidth();
  int h = mScaledBitmap.getHeight();
  int b = 0xFF;
  int out = 0xFF;
  for (int y = 0; y < h; y++) {
      for (int x = 0; x < w; x++) {
          // Extract the blue channel of the pixel as the grayscale intensity
          b = b & mScaledBitmap.getPixel(x, y);
          // For background pixels (class != 15), shift the intensity into all
          // three channels to build an opaque gray ARGB value (0xFF,b,b,b)
          for (int i = 1; i <= 3 && floatOutput[y * w + x] != 15; i++) {
              out = out << 8 | b;
          }
          // Replace background pixels with the gray value; keep object pixels
          mScaledBitmap.setPixel(x, y, floatOutput[y * w + x] != 15 ? out : mScaledBitmap.getPixel(x, y));
          out = 0xFF;
          b = 0xFF;
      }
  }
  mOutputBitmap = Bitmap.createScaledBitmap(mScaledBitmap, originalBitmapW,
          originalBitmapH, true);
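
The resulting bitmap can then be displayed in the UI; a one-line sketch, assuming an ImageView named mImageView (a hypothetical name for illustration):

  // Post the segmented bitmap to the UI thread for display
  runOnUiThread(() -> mImageView.setImageBitmap(mOutputBitmap));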

Below are the images before and after segmentation.

Original Image

Segmented Image
(Background changed to Black & White)

Application Installation

  1. Get the application APK.
  2. Use the ADB tool to install the application (on both Windows and Linux): adb install qdn_segmentation.apk
  3. Run the application on the phone.

Snapdragon and Qualcomm Neural Processing SDK are products of Qualcomm Technologies, Inc. and/or its subsidiaries.