Preface

In the last article, we made faces in a video automatically wear a mask. In this article, we will talk about how to restore a face that is partially occluded.

Face editing network SC-FEGAN

When it comes to image editing, everyone is familiar with Photoshop, which can handle almost all everyday photos, but PS is not that easy to operate, and mastering it requires professional knowledge. How can a beginner edit an image simply by sketching an outline on it? This task can certainly be handed over to deep learning.

1. Introduction to the paper

SC-FEGAN is a fully convolutional network that can be trained end to end. The proposed network uses an SN-PatchGAN discriminator to address and improve disharmonious edges. Besides the usual GAN loss, the system also has a style loss, allowing various parts of a face image to be edited even when a large area is missing.

To summarize SC-FEGAN's contributions:

  • Uses a network architecture similar to U-Net with gated convolutional layers (see the sketch after this list). For both training and testing, this architecture is easier and faster than a coarse network and produces superior, more nuanced results.
  • Creates free-form domain data with masks, color maps, and sketches, so the network can handle free-form image input rather than rigid rectangular input.
  • Applies the SN-PatchGAN discriminator and adds a style loss to the model. The model copes with erasing in most cases, is robust at mask edges, and can generate image details such as high-quality synthetic hairstyles and earrings.
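
For intuition, a gated convolution pairs an ordinary convolution with a learned sigmoid gate that decides, per pixel and per channel, how much of each feature passes through, which is what lets the network cope with free-form holes. Below is a minimal sketch written with tf.keras; the layer arguments and activations here are assumptions for illustration, not the repository's exact implementation.

import tensorflow as tf

def gated_conv2d(x, filters, kernel_size, strides=1, dilation=1):
    # Feature branch: an ordinary convolution with a non-linear activation.
    feature = tf.keras.layers.Conv2D(
        filters, kernel_size, strides=strides, dilation_rate=dilation,
        padding='same', activation=tf.nn.leaky_relu)(x)
    # Gating branch: a sigmoid "soft mask" learned per pixel and per channel.
    gate = tf.keras.layers.Conv2D(
        filters, kernel_size, strides=strides, dilation_rate=dilation,
        padding='same', activation='sigmoid')(x)
    # The gate controls how much of each feature flows to the next layer.
    return tf.keras.layers.Multiply()([feature, gate])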

For more details, read the paper and the open source code:

https://github.com/run-youngjoo/SC-FEGAN

2. Download source code

Clone the code:

git clone https://github.com/run-youngjoo/SC-FEGAN

Download the model from Google Drive:

https://drive.google.com/open?id=1VPsYuIK_DY3Gw07LEjUhg2LwbEDlFpq1

Place the checkpoint files in the ckpt directory:

mv /${HOME}/SC-FEGAN.ckpt.* /${HOME}/ckpt/

3. Modify the configuration file

Modify the demo.yaml file and set GPU_NUM: 0

INPUT_SIZE: 512
BATCH_SIZE: 1

GPU_NUM: 0

# directories
CKPT_DIR: './ckpt/SC-FEGAN.ckpt'

Qt5 interface framework

1. Install Qt5

sudo apt install qt5-default qtcreator qtmultimedia5-dev libqt5serialport5-dev
pip install pyqt5

2. Test window

import sys
from PyQt5 import QtCore, QtWidgets
from PyQt5.QtWidgets import QMainWindow, QLabel, QGridLayout, QWidget
from PyQt5.QtCore import QSize    

class HelloWindow(QMainWindow):
    def __init__(self):
        QMainWindow.__init__(self)

        self.setMinimumSize(QSize(640, 480))
        self.setWindowTitle("Hello world")

        centralWidget = QWidget(self)
        self.setCentralWidget(centralWidget)

        gridLayout = QGridLayout()
        centralWidget.setLayout(gridLayout)

        title = QLabel("Hello World from PyQt", self)
        title.setAlignment(QtCore.Qt.AlignCenter)
        gridLayout.addWidget(title, 0, 0)

if __name__ == "__main__":
    app = QtWidgets.QApplication(sys.argv)
    mainWin = HelloWindow()
    mainWin.show()
    sys.exit(app.exec_())

Everything is ready.

TensorFlow

1. Model inference (OOM)

source my_envs/tensorflow/bin/activate
cd SC-FEGAN/
python3 demo.py 

Loading the 353 MB weight file into the model runs out of memory. A closer look at the memory usage shows that although the Raspberry Pi has 8 GB of RAM, the 32-bit operating system we installed limits a single process to 4 GB of memory.
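
A quick way to confirm the bitness of the Python interpreter you are actually running (a minimal check using only the standard library):

import struct

# Prints 32 on a 32-bit Python build, 64 on a 64-bit build.
print(struct.calcsize("P") * 8)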

2. Choosing between a 32-bit and a 64-bit Raspberry Pi system

The main reasons for not choosing a 64-bit operating system are the following:

  • The current 64-bit system is still quite unstable, and the official images are still being updated frequently;
  • Commonly used software for the Raspberry Pi's ARM platform is mostly released for 32-bit systems. Open source software can be compiled yourself, while commercial software can only wait for updates (such as VPN software);
  • Some peripheral hardware drivers still don't support 64-bit (such as USB cameras);
  • The Raspberry Pi is usually used for inference, not training, and few application scenarios need more than 4 GB of memory;
  • For Raspberry Pi computing, TensorFlow Lite is a better fit, significantly reducing memory use and speeding up inference.

But with the release of the 8 GB Raspberry Pi, the 64-bit system will gradually mature; let's wait and see for a while…

The official 64-bit system has already been released. If you are interested, you can download it and give it a try.

http://downloads.raspberrypi.org/raspios_arm64/images/

You can also consider 64-bit Ubuntu, which takes full advantage of the 8 GB of RAM.

https://ubuntu.com/download/raspberry-pi

Running this model takes about 5.5 GB of memory, which should theoretically be possible on an 8 GB Raspberry Pi with a 64-bit operating system.
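
If you want to watch how much memory the process actually consumes while loading the model, here is a rough sketch (assuming psutil is installed, e.g. pip install psutil):

import os
import psutil

# Resident memory of the current process, in GB.
rss_gb = psutil.Process(os.getpid()).memory_info().rss / 1024 ** 3
print('current RSS: %.2f GB' % rss_gb)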

TensorFlow Lite

1. Find the input/output layers

SC-FEGAN takes five images as input: the real image, a monochrome sketch, a color stroke map, a noise image, and a mask image. First visualize them, and pay attention to the value range of each image.

def visualImageOutput(self, real, sketch, stroke, noise, mask):
    """Dump the five network inputs as images for inspection.

    :param real:   real image, range [-1, 1]
    :param sketch: monochrome sketch, range [0, 1]
    :param stroke: color strokes, range [-1, 1]
    :param noise:  noise map, range [0, 1]
    :param mask:   mask, range [0, 1]
    """
    # numpy (np) and cv2 are assumed to be imported at module level
    temp = (real + 1) * 127.5
    temp = np.asarray(temp[0, :, :, :], dtype=np.uint8)
    cv2.imwrite('real.jpg', temp)

    temp = sketch * 255
    temp = np.asarray(temp[0, :, :, 0], dtype=np.uint8)
    cv2.imwrite('sketch.jpg', temp)

    temp = (stroke + 1) * 127.5
    temp = np.asarray(temp[0, :, :, :], dtype=np.uint8)
    cv2.imwrite('stroke.jpg', temp)

    temp = noise * 255
    temp = np.asarray(temp[0, :, :, 0], dtype=np.uint8)
    cv2.imwrite('noise.jpg', temp)

    temp = mask * 255
    temp = np.asarray(temp[0, :, :, 0], dtype=np.uint8)
    cv2.imwrite('mask.jpg', temp)

The output is the image generated by the GAN, and the corresponding output layer is Tanh.

# Name of the node 0 - real_images
# Name of the node 1 - sketches
# Name of the node 2 - color
# Name of the node 3 - masks
# Name of the node 4 - noises
# Name of the node 1003 - generator/Tanh

Remember the indices and names of these network layers for easy access later.
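
One way such a list can be obtained is by enumerating the nodes of the graph; a sketch, assuming the tf.Session built in demo.py is available as self.sess:

# Enumerate graph nodes to locate the input and output layers (sketch).
for i, node in enumerate(self.sess.graph_def.node):
    print('Name of the node', i, '-', node.name)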

2. Convert the ckpt to pb

Here, add is the alias of the custom output layer, and saved_model.pb is the serialized frozen graph file we need.

# TensorFlow 1.x is assumed to be imported as tf
from tensorflow.python.framework import graph_util

constant_graph = graph_util.convert_variables_to_constants(
    self.sess, self.sess.graph_def, ['add'])

# Write the serialized pb file
with tf.gfile.FastGFile('saved_model.pb', mode='wb') as f:
    f.write(constant_graph.SerializeToString())
    print('save', 'saved_model.pb')
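
For context, the add alias is presumably the final op that composites the generated pixels inside the mask with the original pixels outside it; the line below is only an assumption about how such an output could be built and named, not the repository's exact code.

# Assumption: composite generated and original pixels, and name the op so it
# can be referenced when freezing and converting the graph.
completed = tf.add(gen_img * masks, real_images * (1.0 - masks), name='add')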

3. Convert the pb to TFLite

Knowing the names of the input/output layers, you can easily get the model_pb.tflite file.

path = 'saved_model.pb'
inputs = ['real_images', 'sketches', 'color', 'masks', 'noises']  # input node names of the model file
outputs = ['add']  # output node name of the model file
converter = tf.contrib.lite.TocoConverter.from_frozen_graph(path, inputs, outputs)
# converter.post_training_quantize = True
tflite_model = converter.convert()
open("model_pb.tflite", "wb").write(tflite_model)
print('tflite convert done.')

You can set post_training_quantize to enable compression and reduce the size of the model by about half.
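
A quick way to check the effect on disk (a trivial sketch):

import os

# Compare the converted file size before and after enabling quantization.
print('model_pb.tflite: %.1f MB' % (os.path.getsize('model_pb.tflite') / 1e6))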

Once the conversion is done on the laptop or server, copy the TFLite file to the Raspberry Pi.

4. Activate the TensorFlow Lite environment

deactivate
source ~/my_envs/tf_lite/bin/activate

5. Read the model

# Load TFLite model and allocate tensors.
# Assumes the interpreter module is imported as: import tflite_runtime.interpreter as tflite
model_path = "./model_pb.tflite"
self.interpreter = tflite.Interpreter(model_path=model_path)
self.interpreter.allocate_tensors()

# Get input and output tensors.
self.input_details = self.interpreter.get_input_details()
self.output_details = self.interpreter.get_output_details()

6. Model inference

Note the batch data structure here: the NumPy matrix is split along the channel axis to match the dimensions of the different input images.
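
For reference, here is a sketch of how a 9-channel batch matching those split indices could be assembled (the exact assembly in the original demo may differ):

import numpy as np

# Channel layout assumed by the split indices [3, 4, 7, 8, 9]:
batch = np.concatenate(
    [real_image,   # channels 0-2: real image (3)
     sketch,       # channel  3  : monochrome sketch (1)
     stroke,       # channels 4-6: color strokes (3)
     mask,         # channel  7  : mask (1)
     noise],       # channel  8  : noise (1)
    axis=3).astype(np.float32)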

Then fill in the tensors in the order the network defines them.

real_images, sketches, color, masks, noises, _ = np.split(
    batch.astype(np.float32), [3, 4, 7, 8, 9], axis=3)

# Fill in the input data
interpreter.set_tensor(input_details[0]['index'], real_images)
interpreter.set_tensor(input_details[1]['index'], sketches)
interpreter.set_tensor(input_details[2]['index'], color)
interpreter.set_tensor(input_details[3]['index'], masks)
interpreter.set_tensor(input_details[4]['index'], noises)

# Call the model
interpreter.invoke()

# Output image
result = interpreter.get_tensor(output_details[0]['index'])

Finally, the normalized output is restored to image format.

result = (result + 1) * 127.5
result = np.asarray(result[0, :, :, :], dtype=np.uint8)
self.output_img = result

7. Run the program

python3 demo_tflite_rpi.py 

Click Open Image to open the image file, then click Mask and paint over the glasses, and finally click Complete.

Inference on the Raspberry Pi takes 18 seconds, which is not bad, and requires only about 1 GB of memory.
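
If you want to measure this yourself, a minimal timing sketch around the invoke call:

import time

start = time.time()
interpreter.invoke()
print('inference took %.1f s' % (time.time() - start))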

Tip:

Note that the path of the opened image must not contain Chinese characters, otherwise the image object cannot be obtained.
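
A small guard you could add before loading (a sketch; cv2.imread silently returns None when it cannot read the file):

import cv2

def load_image_checked(path):
    # Reject paths with non-ASCII characters, which cv2.imread may fail on.
    if any(ord(ch) > 127 for ch in path):
        raise ValueError('Please use a path without non-ASCII characters: %s' % path)
    img = cv2.imread(path)
    if img is None:
        raise IOError('Failed to load image: %s' % path)
    return img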

Android Application Deployment

Now that the model has been converted to TensorFlow Lite, you can easily deploy it to a mobile phone by following our previous tutorial.

1. Configure dependencies

Configure the TensorFlow Lite dependencies in build.gradle:

def tfl_version = "0.0.0-nightly"
implementation("org.tensorflow:tensorflow-lite:${tfl_version}") { changing = true }
implementation("org.tensorflow:tensorflow-lite-gpu:${tfl_version}") { changing = true }

2. Read the model

modelName is set to model_pb.tflite, and the getInterpreter function is used to load the network model.

@Throws(IOException::class)
private fun getInterpreter(
  context: Context,
  modelName: String,
  useGpu: Boolean = false
): Interpreter {
  val tfliteOptions = Interpreter.Options()
  tfliteOptions.setNumThreads(numberThreads)

  gpuDelegate = null
  if (useGpu) {
    gpuDelegate = GpuDelegate()
    tfliteOptions.addDelegate(gpuDelegate)
  }

  tfliteOptions.setNumThreads(numberThreads)
  return Interpreter(loadModelFile(context, modelName), tfliteOptions)
}

3. Model inference

The core code defines the input and output image arrays:

// inputRealImage 1 x 512 x 512 x 3 x 4
// inputSketches 1 x 512 x 512 x 1 x 4
// inputStroke 1 x 512 x 512 x 3 x 4
// inputMask 1 x 512 x 512 x 1 x 4
// inputNoises 1 x 512 x 512 x 1 x 4
val inputs = arrayOf<Any>(inputRealImage, inputSketches, inputStroke, inputMask, inputNoises)
val outputs = HashMap<Int, Any>()
val outputImage =
  Array(1) { Array(CONTENT_IMAGE_SIZE) { Array(
    CONTENT_IMAGE_SIZE
  ) { FloatArray(3) } } }

outputs[0] = outputImage
Log.i(TAG, "init image"+inputRealImage)

styleTransferTime = SystemClock.uptimeMillis()
interpreterTransform.runForMultipleInputsOutputs(
  inputs,
  outputs
)

styleTransferTime = SystemClock.uptimeMillis() - styleTransferTime
Log.d(TAG, "Style apply Time to run: $styleTransferTime")

4. Package the app

Glasses can’t hide your beauty now.

Perfect!

Well, from automatically putting on a mask to automatically taking it off, this kind of paired, complementary AI application is not only interesting, but the two sides can also augment each other's data and improve each other's models.

Download the source code

The relevant files for this issue can be downloaded by following the official account "Deep Awakening" and replying "RPI13" in the background.

Next up

In the next article, we will introduce another such pair of applications, in the spirit of Zhou Botong's left-and-right-hand sparring technique. Let's learn together! Stay tuned…