TensorFlow Lite + GPUImage for AI Background Blurring

I recently worked on a background blur feature. The requirement was to write a GPUImage filter and combine it with TensorFlow Lite so that the background behind a specified object in a picture gets blurred. Most of this was learned by reading the official documentation and experimenting on my own, so here is a summary note. The content covers integrating TensorFlow Lite on Android, running an AI model to identify objects in a picture, and blurring their background. This article is the first part: TensorFlow Lite integration and model usage.

What is TensorFlow Lite

TensorFlow Lite is an official Google machine learning library that lets us run TensorFlow Lite models on mobile devices. It consists of two main components, described in the official documentation as follows:

The TensorFlow Lite interpreter, which runs specially optimized models on many different hardware types, including mobile phones, embedded Linux devices, and microcontrollers. The TensorFlow Lite converter, which converts TensorFlow models into an efficient form for use by the interpreter, and can introduce optimizations to improve binary size and performance.

In other words, the two components are a converter and an interpreter.

The converter, as the name suggests, converts a regular TensorFlow model into a model file suitable for mobile use. The interpreter imports and runs that model on the Android side. What we need to do is add the TensorFlow Lite library to the project and then use the interpreter APIs it provides to import the AI model file, run the model, and parse the resulting data.

TensorFlow Lite integration

Google provides a demo project that includes examples using different models, such as image recognition and image segmentation; you can refer to it directly: Tensorflow Lite Android demo

First, prepare the TensorFlow Lite model file, which on Android is a .tflite file. You can use the examples in the demo directly, or use a model file in tflite format that you have generated yourself with the converter.

Next, add the following dependencies to the module's build.gradle file:

    // TensorFlow Lite core library
    implementation "org.tensorflow:tensorflow-lite:2.1.0"
    // TensorFlow Lite support library (optional)
    implementation "org.tensorflow:tensorflow-lite-support:0.0.0-nightly"


This is enough to get started, but Google also recommends limiting the native ABIs the app packages, so continue by adding the following configuration to the build.gradle file:

android {
    defaultConfig {
        ndk {
            abiFilters 'armeabi-v7a', 'arm64-v8a'
        }
    }
}

Finally, take the tflite file (named model.tflite here) and put it in the src/main/assets directory of the project module.

Ok, the next step is to import and use the model file in code.

Using the model

After importing the model, using it involves roughly the following steps:

  1. Convert the input data into an input ByteBuffer in the correct format
  2. Create an output ByteBuffer ready to receive the results of the run
  3. Create an interpreter for the model and run the model
  4. Process the output data from the model run

TensorFlow Lite takes both its input and output data as buffers, essentially Java NIO buffers. The image segmentation model is used as the example in the following sections.
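Before going through each step in detail, here is a minimal end-to-end sketch of the whole flow, assuming a 500 * 500 segmentation model; rgbArray(), rgbToByteBuffer() and createMask() are the helper functions built step by step in the rest of this article, and the label/color choice at the end is only illustrative:

import android.content.Context
import android.graphics.Bitmap
import android.graphics.Color
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.support.common.FileUtil
import java.nio.ByteBuffer
import java.nio.ByteOrder

// A minimal sketch of the four steps, assuming a 500 x 500 segmentation model
fun segment(context: Context, original: Bitmap): Bitmap {
    // 1. Convert the input image into a ByteBuffer in the format the model expects
    val scaled = Bitmap.createScaledBitmap(original, 500, 500, true)
    val input = rgbToByteBuffer(scaled.rgbArray(), 500, 500) { (it * 2 / 255F) - 1 }

    // 2. Prepare an output buffer large enough for the model's result (500 * 500 Longs)
    val output = ByteBuffer.allocateDirect(500 * 500 * 8)
        .order(ByteOrder.nativeOrder())
        .asLongBuffer()

    // 3. Create the interpreter from the model file in assets and run it
    val interpreter = Interpreter(FileUtil.loadMappedFile(context, "model.tflite"))
    interpreter.run(input, output)

    // 4. Turn the output labels into a mask Bitmap (illustrative colors only)
    val labels = IntArray(output.limit()) { output[it].toInt() }
    return createMask(labels.toTypedArray(), 500, 500) { value ->
        if (value == 3 /* e.g. the label for "cat" */) Color.BLACK else Color.RED
    }
}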

Processing of input data

In the image segmentation model, the input data is obviously a picture. Pictures come in arbitrary sizes, while a model generally expects an input of a fixed size, so the first step is to convert pictures of different sizes into that fixed size by scaling or padding. This is ordinary Bitmap processing and will not be detailed here.
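For example, a simple scale to the model's expected size (500 * 500 is assumed here) might look like this sketch:

// A minimal sketch: scale an arbitrary Bitmap to the fixed size the model expects.
// 500 is assumed here; padding to preserve the aspect ratio is also an option.
fun Bitmap.scaleForModel(size: Int = 500): Bitmap =
    Bitmap.createScaledBitmap(this, size, size, true)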

Once the Bitmap is processed, the next step is to convert the Bitmap to a three-dimensional array. This three-dimensional array is made up of RGB values for each pixel in the image. For example, if the size of the image is 500 by 500, then the size of the three-dimensional array is 500 by 500 by 3. The code is as follows:


/**
 * Convert the Bitmap to an RGB array
 * @return A three-dimensional array; the third dimension is the RGB value of the pixel
 */
fun Bitmap.rgbArray(): Array<Array<FloatArray>> {
    val pixelsValues = IntArray(this.width * this.height)
    this.getPixels(pixelsValues, 0, this.width, 0, 0, this.width, this.height)
    val result = Array(this.height) { Array(this.width) { FloatArray(3) } }
    var pixel = 0
    for (y in 0 until this.height) {
        for (x in 0 until this.width) {
            val value = pixelsValues[pixel++]
            result[y][x][0] = (value shr 16 and 0xFF).toFloat() // r
            result[y][x][1] = (value shr 8 and 0xFF).toFloat()  // g
            result[y][x][2] = (value and 0xFF).toFloat()        // b
        }
    }
    return result
}


Once you have this three-dimensional array, you need to convert it into a ByteBuffer. The array holds 500 * 500 * 3 float values, and since a Float takes 4 bytes, the corresponding ByteBuffer size is 500 * 500 * 3 * 4 bytes. In addition, the model's input data generally needs to be normalized first, which is just simple addition, subtraction, multiplication and division on each value. The code is as follows:


/**
 * Convert the RGB array to the ByteBuffer needed to run the model
 * @param array       RGB array
 * @param normalizeOp normalization operation
 */
private fun rgbToByteBuffer(
    array: Array<Array<FloatArray>>,
    width: Int,
    height: Int,
    normalizeOp: ((Float) -> Float) = { it }
): ByteBuffer {
    val inputImage = ByteBuffer.allocateDirect(1 * width * height * 3 * 4)
    inputImage.order(ByteOrder.nativeOrder()).rewind()
    for (y in 0 until height) {
        for (x in 0 until width) {
            for (z in listOf(0, 1, 2)) {
                val value = array[y][x][z]
                inputImage.putFloat(normalizeOp(value))
            }
        }
    }
    inputImage.rewind()
    return inputImage
}

val input = rgbToByteBuffer(rgbArray, 500, 500) { (it * 2 / 255F) - 1 }

    

After converting to a ByteBuffer, the input side is done; this is the input data we need.

Preparing output data

The output data, like the input, is a ByteBuffer, just of a different size, and how big it is depends on the model. For this image segmentation model, the output is a 500 * 500 two-dimensional array of type Long, i.e. a LongBuffer of size 500 * 500. Since one Long takes 8 bytes, the final output buffer is a 500 * 500 * 8 byte ByteBuffer, created with the following code:


val output = ByteBuffer.allocateDirect(TF_DPI_SEG * TF_DPI_SEG * 8).order(ByteOrder.nativeOrder()).asLongBuffer()


Create the interpreter and run it

The Tensorflow Lite Interpreter is simply created as follows:


val byteBuffer = FileUtil.loadMappedFile(context, "model.tflite")
val options = Interpreter.Options()
val interpreter = Interpreter(byteBuffer, options)


The constructor takes two parameters: the first is the model file converted to a ByteBuffer, and the second is an Options object carrying whatever configuration is currently required. Converting the model file to a ByteBuffer can be done directly with the utility class provided in the support library.
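The Options object is also where optional tuning goes. The sketch below is illustrative only: setNumThreads() is part of Interpreter.Options, while the GPU delegate requires the separate org.tensorflow:tensorflow-lite-gpu dependency, which is not part of this project's setup:

// Optional tuning, not required for the basic flow
val options = Interpreter.Options().apply {
    setNumThreads(4)              // run the CPU kernels on 4 threads
    // addDelegate(GpuDelegate()) // needs the tensorflow-lite-gpu dependency
}
val interpreter = Interpreter(FileUtil.loadMappedFile(context, "model.tflite"), options)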

Once the interpreter object is created, call its run() method and pass in the previously prepared input and output data:

// ...
interpreter.run(input, output)
// ...


Processing of output data

Finally, the processing of the output data. As mentioned earlier, the goal is to blur the background behind an object in the picture, so the model's output is used to generate a mask of the target object: pixels belonging to the target object get one color and everything else gets another, which makes the later processing easier. Therefore, after obtaining the output ByteBuffer, first convert it into a regular one-dimensional or two-dimensional array, and then generate a Bitmap from that array. The code is as follows:

// ...
var array = IntArray(output.limit()) {
    output[it].toInt()
}


The above code converts the output data into an Int array, where each value represents the object type of the corresponding pixel, such as 1 for a person, 2 for a dog, 3 for a cat, and so on. The next step is to use this array to generate a Bitmap. The code is as follows:


/**
 * Generate a Bitmap from the label array
 * @param pixelsOp converts each value in the array to a specific pixel value
 */
fun <T> createMask(array: Array<T>, width: Int, height: Int, pixelsOp: (T) -> Int): Bitmap {
    val conf = Bitmap.Config.ARGB_8888
    val maskBitmap = Bitmap.createBitmap(width, height, conf)

    for (y in 0 until height) {
        for (x in 0 until width) {
            val value = array[x + width * y]
            val color = pixelsOp(value)
            maskBitmap.setPixel(x, y, color)
        }
    }

    return maskBitmap
}

val targetLabel = 3 // based on the actual situation, e.g. 3 = "cat"
val mask = createMask(array.toTypedArray(), TF_DPI_SEG, TF_DPI_SEG) { value ->
    // the target object stays black, everything else is colored red
    Color.argb(255, if (value == targetLabel) 0 else 255, 0, 0)
}

Adjust this to the actual situation. For example, if the target object is a cat, the part belonging to the cat is black and the part not belonging to the cat is red, so the final output Bitmap is:

The original:

Results:
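One last note before moving on to the blur itself: the mask generated here is only 500 * 500 (TF_DPI_SEG), so in practice it has to be scaled back up to the original picture's size before it can be applied to the original image. A minimal sketch, assuming originalBitmap is the untouched input picture:

// Scale the low-resolution mask back to the original picture's dimensions
// so it lines up with the original pixels in the later blur step
val fullSizeMask = Bitmap.createScaledBitmap(mask, originalBitmap.width, originalBitmap.height, true)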