| Skill Level | Area of Focus | Operating System | Software Tools |
| --- | --- | --- | --- |
Intermediate | Artificial Intelligence, Computer Vision | Android | Neural Processing SDK for AI |
The project uses the Qualcomm® Neural Processing SDK to convert trained models from Caffe, Caffe2, ONNX, and TensorFlow into the Snapdragon-supported format (.dlc). We then use the converted model in an application that performs semantic segmentation with DeepLabV3+.
Objective
The main objective is to create a machine learning application that performs selective background manipulation on an image, according to the user's needs, using an architecture such as DeepLabV3+.
Materials Required / Parts List / Tools
Source Code / Source Examples / Application Executable
Build / Assembly Instructions
Requirements
Below are the items used in this project.
- Mobile Display with QC Image Segmentation app
- Snapdragon Mobile Hardware Development Kit
- USB type-C cable
- External camera setup
- Power Cable
Deploying the project
- Download code from the GitHub Repository.
- Compile the code and run the application from Android Studio to generate the application (APK) file.
How does it work?
The Image Segmentation application opens a camera preview, captures a picture, and converts it to a bitmap. The network is built via the Neural Network builder by passing deeplabv3.dlc as input. The bitmap is then given to the model for inference, which returns a FloatTensor output. That output is post-processed to achieve background manipulation (converting the background to black and white) of the original input image.
Prerequisite for Camera Preview
Permission to obtain camera preview frames is requested in AndroidManifest.xml:

```xml
<uses-permission android:name="android.permission.CAMERA" />
```
To use the Camera2 API, also declare the following feature:

```xml
<uses-feature android:name="android.hardware.camera2" />
```
Loading Model
Code snippet for building the neural network and loading the model:

```java
@Override
protected NeuralNetwork doInBackground(File... params) {
    final SNPE.NeuralNetworkBuilder builder = new SNPE.NeuralNetworkBuilder(mApplicationContext)
        // Select a runtime order for the network.
        // Here, try DSP first and fall back, in order, to GPU then CPU,
        // depending on which runtimes are available.
        .setRuntimeOrder(DSP, GPU, CPU)
        // Load the model from the DLC file
        .setModel(new File("<model-path>"));
    // Build and return the network
    final NeuralNetwork network = builder.build();
    return network;
}
```
Capturing Preview using Camera2 API:
A TextureView is used to render the camera preview. TextureView.SurfaceTextureListener is an interface that notifies the application when the surface texture associated with the texture view becomes available.

```java
private final TextureView.SurfaceTextureListener mSurfaceTextureListener =
        new TextureView.SurfaceTextureListener() {
    @Override
    public void onSurfaceTextureAvailable(SurfaceTexture texture, int width, int height) {
        // Check the runtime camera permission
        if (ContextCompat.checkSelfPermission(getActivity(), Manifest.permission.CAMERA)
                != PackageManager.PERMISSION_GRANTED) {
            requestCameraPermission();
            return;
        }
        CameraManager manager = (CameraManager) activity.getSystemService(Context.CAMERA_SERVICE);
        try {
            if (!mCameraOpenCloseLock.tryAcquire(2500, TimeUnit.MILLISECONDS)) {
                throw new RuntimeException("Time out waiting to lock camera opening.");
            }
            // mCameraId = 1 (front camera), mCameraId = 0 (rear camera)
            manager.openCamera(mCameraId, mStateCallback, mBackgroundHandler);
        } catch (CameraAccessException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            throw new RuntimeException("Interrupted while trying to lock camera opening.", e);
        }
    }
    // Remaining SurfaceTextureListener callbacks omitted for brevity
};
```
Camera Callbacks
The camera callback, CameraDevice.StateCallback, is used for receiving updates about the state of a camera device. In the overridden method below, a preview capture request is created and the preview surface is added as its target so that frames can be obtained.

```java
@Override
public void onOpened(@NonNull CameraDevice cameraDevice) {
    // This method is called when the camera is opened. Start the camera preview here.
    mCameraOpenCloseLock.release();
    mCameraDevice = cameraDevice;
    mPreviewRequestBuilder = mCameraDevice.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW);
    mPreviewRequestBuilder.addTarget(surface);
}
```
Getting image data from ImageReader
The ImageReader class allows direct application access to image data rendered into a Surface. The application uses this class to save the captured image and obtain its file path.

```java
private final ImageReader.OnImageAvailableListener mOnImageAvailableListener =
        new ImageReader.OnImageAvailableListener() {
    @Override
    public void onImageAvailable(ImageReader reader) {
        mBackgroundHandler.post(new ImageSaver(reader.acquireNextImage(), mFile));
    }
};
```
Getting Bitmap from Texture view
A bitmap of fixed height and width can be obtained from the TextureView in the onCaptureCompleted callback, using TotalCaptureResult. That bitmap can then be compressed and sent to the model as input.

```java
private CameraCaptureSession.CaptureCallback mCaptureCallback =
        new CameraCaptureSession.CaptureCallback() {
    @Override
    public void onCaptureCompleted(@NonNull CameraCaptureSession session,
                                   @NonNull CaptureRequest request,
                                   @NonNull TotalCaptureResult result) {
        // Get the file path from ImageReader and decode the saved image
        String stringUri = mFile.getAbsolutePath();
        Bitmap mBitmap = BitmapFactory.decodeFile(stringUri);
    }
};
```
Object Inference
The bitmap image is converted to an RGBA byte buffer (513 × 513 × 4 bytes), which is then pre-processed into a float input tensor of shape 513 × 513 × 3, matching the input shape the model requires. The prediction API takes a tensor of type Float and returns the object prediction in a Map<String, FloatTensor> object.
```java
private Map<String, FloatTensor> inferenceOnBitmap(Bitmap scaledBitmap) {
    final Map<String, FloatTensor> outputs;
    try {
        // Safety check
        if (mNeuralnetwork == null || mInputTensorReused == null
                || scaledBitmap.getWidth() != getInputTensorWidth()
                || scaledBitmap.getHeight() != getInputTensorHeight()) {
            Logger.d("SNPEHelper", "No NN loaded, or image size different than tensor size");
            return null;
        }
        // [0.3ms] Bitmap to RGBA byte buffer (size: 513*513*4 bytes)
        mBitmapToFloatHelper.bitmapToBuffer(scaledBitmap);
        // [2ms] Pre-processing: Bitmap (513,513,4 ints) -> Float Input Tensor (513,513,3 floats)
        final float[] inputFloatsHW3 = mBitmapToFloatHelper.bufferToNormalFloatsBGR();
        if (mBitmapToFloatHelper.isFloatBufferBlack())
            return null;
        mInputTensorReused.write(inputFloatsHW3, 0, inputFloatsHW3.length, 0, 0);
        // [31ms on GPU16, 50ms on GPU] Execute the inference
        outputs = mNeuralnetwork.execute(mInputTensorsMap);
    } catch (Exception e) {
        e.printStackTrace();
        Logger.d("SNPEHelper", e.getCause() + "");
        return null;
    }
    return outputs;
}
```
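As a quick sanity check on the buffer sizes involved (513 is the model's input side length; the class and method names below are hypothetical, not part of the project code):

```java
public class TensorSizes {
    // DeepLabV3 input resolution used by this project
    static final int SIDE = 513;

    // RGBA bitmap buffer: 4 bytes per pixel
    static int rgbaBytes() {
        return SIDE * SIDE * 4;
    }

    // Float input tensor: 3 channels (BGR) per pixel
    static int tensorFloats() {
        return SIDE * SIDE * 3;
    }
}
```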
Processing pixels from RGBA (0..255) to BGR (-1..1)
```java
public float[] bufferToNormalFloatsBGR() {
    final byte[] inputArrayHW4 = mByteBufferHW4.array();
    final int area = mFloatBufferHW3.length / 3;
    long sumG = 0;
    int srcIdx = 0, dstIdx = 0;
    final float inputScale = 0.00784313771874f; // 2 / 255
    for (int i = 0; i < area; i++) {
        // NOTE: the & 0xFF acts as a "cast" to unsigned int
        // (otherwise bright colors would become negative numbers)
        final int pixelR = inputArrayHW4[srcIdx] & 0xFF;
        final int pixelG = inputArrayHW4[srcIdx + 1] & 0xFF;
        final int pixelB = inputArrayHW4[srcIdx + 2] & 0xFF;
        // Swap R and B (RGBA -> BGR) and normalize each channel into [-1, 1]
        mFloatBufferHW3[dstIdx] = inputScale * (float) pixelB - 1;
        mFloatBufferHW3[dstIdx + 1] = inputScale * (float) pixelG - 1;
        mFloatBufferHW3[dstIdx + 2] = inputScale * (float) pixelR - 1;
        srcIdx += 4;
        dstIdx += 3;
        sumG += pixelG;
    }
    // The buffer is considered black if, on average, Green < 13/255 (about 5%)
    mIsFloatBufferBlack = sumG < (area * 13);
    return mFloatBufferHW3;
}
```
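The constant inputScale above is 2/255 ≈ 0.00784, so channel value 0 maps to -1.0 and 255 maps to approximately +1.0. A minimal standalone check of that mapping (the class and helper below are hypothetical, written only to illustrate the normalization):

```java
public class NormalizeDemo {
    static final float INPUT_SCALE = 0.00784313771874f; // ≈ 2 / 255

    // Map an unsigned byte channel (0..255) into the model's [-1, 1] range
    static float normalize(int channel) {
        return INPUT_SCALE * (float) channel - 1f;
    }
}
```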
Image Segmentation
The output FloatTensor is further processed for background manipulation. The float matrix contains 0 for background pixels and 15 for pixels of the detected object. Based on this matrix, the output bitmap is created with background pixels converted to black and white while the object's pixels are retained.
```java
MNETSSD_NUM_BOXES = mOutputs.get(MNETSSD_OUTPUT_LAYER).getSize();
// Convert tensors to boxes - Note: optimized to read all upfront
mOutputs.get(MNETSSD_OUTPUT_LAYER).read(floatOutput, 0, MNETSSD_NUM_BOXES);
// For the black/white background image
int w = mScaledBitmap.getWidth();
int h = mScaledBitmap.getHeight();
int b = 0xFF;
int out = 0xFF;
for (int y = 0; y < h; y++) {
    for (int x = 0; x < w; x++) {
        b = b & mScaledBitmap.getPixel(x, y);
        for (int i = 1; i <= 3 && floatOutput[y * w + x] != 15; i++) {
            out = out << 8 | b;
        }
        mScaledBitmap.setPixel(x, y,
                floatOutput[y * w + x] != 15 ? out : mScaledBitmap.getPixel(x, y));
        out = 0xFF;
        b = 0xFF;
    }
}
mOutputBitmap = Bitmap.createScaledBitmap(mScaledBitmap, originalBitmapW, originalBitmapH, true);
```
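The pixel loop above ANDs each pixel with 0xFF to keep only its blue channel, then replicates that value into all three color channels (with alpha 0xFF) for pixels whose label is not 15, turning the background grayscale. The same idea expressed on a plain int array (a standalone sketch; the class name and array-based API are hypothetical, not part of the project code):

```java
public class MaskDemo {
    static final int OBJECT_CLASS = 15; // label the model reports for the detected object

    // Turn background pixels (label != OBJECT_CLASS) grayscale using their
    // blue channel, keeping object pixels unchanged. Pixels are ARGB ints,
    // labels has one entry per pixel in row-major order.
    static int[] maskBackground(int[] pixels, float[] labels) {
        int[] result = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            if (labels[i] != OBJECT_CLASS) {
                int blue = pixels[i] & 0xFF; // keep only the blue channel
                result[i] = 0xFF000000 | (blue << 16) | (blue << 8) | blue;
            } else {
                result[i] = pixels[i]; // object pixel: keep original color
            }
        }
        return result;
    }
}
```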
Below are the images before and after segmentation.
Usage Instructions
Prerequisites to install Android application
- An Android phone running Android 7.0 or above.
- ADB installed on your Windows/Linux system. Instructions can be found here: https://developer.android.com/studio/command-line/adb.html
Application Installation
- Get the application APK.
- Use the ADB tool to install the application (on both Windows and Linux):

```shell
adb install qdn_segmentation.apk
```

- Run the application on the phone.
Snapdragon and Qualcomm Neural Processing SDK are products of Qualcomm Technologies, Inc. and/or its subsidiaries.