Preface
In the last article, we made the faces in a video automatically wear a mask. In this article, we will talk about how to restore a face that is partially occluded.
Face editing network SC-FEGAN
When it comes to image editing, everyone is familiar with Photoshop, which can handle almost all everyday photos, but Photoshop is not so easy to operate, and mastering it requires professional knowledge. How can a beginner edit an image just by drawing a rough outline on it? This task can certainly be handed over to deep learning.
1. Introduction to the paper
SC-FEGAN is a fully convolutional network that can be trained end-to-end. The proposed network uses an SN-PatchGAN discriminator to address and improve disharmonious edges. The system has not only a general GAN loss but also a style loss, allowing various parts of a face image to be edited even when large areas are missing.
To summarize SC-FEGAN's contributions:
- Uses a network architecture similar to U-Net, together with gated convolutional layers. For both training and inference, this architecture is easier and faster, and produces superior, more nuanced results compared with coarse networks.
- Creates free-form domain data for masks, color maps, and sketches, so the network can handle free-form, incomplete image input rather than rigid, fixed-form input.
- Applies the SN-PatchGAN discriminator and adds an extra style loss to the model. The model copes with erasing large areas, is robust in handling mask edges, and can also generate image details such as high-quality synthetic hairstyles and earrings (a minimal loss sketch follows below).
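To make the two loss terms above concrete, here is a minimal TF1-style sketch of how an adversarial (SN-PatchGAN-style hinge) term and a Gram-matrix style term can be combined. It is only an illustration, not the authors' training code; the function names, the VGG-feature inputs, and the style_weight value are assumptions.
import tensorflow as tf

def gram(feat):
    # Gram matrix of a feature-map batch with shape (B, H, W, C); assumes static shapes
    b, h, w, c = feat.get_shape().as_list()
    flat = tf.reshape(feat, [b, h * w, c])
    return tf.matmul(flat, flat, transpose_a=True) / float(h * w * c)

def generator_loss(d_fake, vgg_feats_gen, vgg_feats_gt, style_weight=100.0):
    # Adversarial term: raise the discriminator's score on generated images
    gan_loss = -tf.reduce_mean(d_fake)
    # Style term: L1 distance between Gram matrices of VGG feature maps
    style_loss = tf.add_n([tf.reduce_mean(tf.abs(gram(fg) - gram(ft)))
                           for fg, ft in zip(vgg_feats_gen, vgg_feats_gt)])
    return gan_loss + style_weight * style_loss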
For more details, read the paper and the open-source code:
https://github.com/run-youngjoo/SC-FEGAN
2. Download source code
Clone the code:
git clone https://github.com/run-youngjoo/SC-FEGAN
Download the model from Google Drive:
https://drive.google.com/open?id=1VPsYuIK_DY3Gw07LEjUhg2LwbEDlFpq1
Place the checkpoint files in the ckpt directory:
mv ${HOME}/SC-FEGAN.ckpt.* ${HOME}/ckpt/
3. Modify the configuration file
Modify the demo.yaml file and set GPU_NUM: 0
INPUT_SIZE: 512
BATCH_SIZE: 1
GPU_NUM: 0
# directories
CKPT_DIR: './ckpt/SC-FEGAN.ckpt'
Qt5 interface framework
1. Install Qt5
sudo apt install qt5-default qtcreator qtmultimedia5-dev libqt5serialport5-dev
pip install pyqt5
2. Test window
import sys
from PyQt5 import QtCore, QtWidgets
from PyQt5.QtWidgets import QMainWindow, QLabel, QGridLayout, QWidget
from PyQt5.QtCore import QSize
class HelloWindow(QMainWindow):
    def __init__(self):
        QMainWindow.__init__(self)
        self.setMinimumSize(QSize(640, 480))
        self.setWindowTitle("Hello world")
        centralWidget = QWidget(self)
        self.setCentralWidget(centralWidget)
        gridLayout = QGridLayout()
        centralWidget.setLayout(gridLayout)
        title = QLabel("Hello World from PyQt", self)
        title.setAlignment(QtCore.Qt.AlignCenter)
        gridLayout.addWidget(title, 0, 0)

if __name__ == "__main__":
    app = QtWidgets.QApplication(sys.argv)
    mainWin = HelloWindow()
    mainWin.show()
    sys.exit(app.exec_())
Everything is ready.
TensorFlow
1. Model inference (OOM)
source my_envs/tensorflow/bin/activate
cd SC-FEGAN/
python3 demo.py
The 353 MB weight file was read into the model and the process ran out of memory. A closer look at memory usage shows that although the Raspberry Pi has 8 GB of RAM, due to the limitations of the 32-bit operating system we installed, a single process can only use 4 GB of memory.
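A quick way to confirm which userland Python is running (a small sketch; the printed values are typical examples, not guaranteed on every image):
import platform

print(platform.machine())       # e.g. 'armv7l' on 32-bit Raspberry Pi OS, 'aarch64' on 64-bit
print(platform.architecture())  # e.g. ('32bit', 'ELF') for a 32-bit Python build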
2. Choosing between a 32-bit and a 64-bit Raspberry Pi system
The main reasons for not choosing a 64-bit operating system are the following:
- The current 64-bit system is still quite unstable, and the official build is still updated frequently;
- Commonly used software for the Raspberry Pi's ARM platform is mostly released for 32-bit systems. Open-source software can be compiled yourself, while for commercial software you can only wait for updates (such as VPN software);
- Some peripheral hardware drivers still don't support 64-bit (such as USB cameras);
- Inference, not training, is usually done on the Raspberry Pi, and few application scenarios require more than 4 GB of memory;
- For Raspberry Pi computing, TensorFlow Lite is a better fit, significantly reducing memory use and speeding up inference.
But with the release of the 8 GB Raspberry Pi, the 64-bit system will gradually mature, so let's wait and see for a while...
The official 64-bit system has already been released. If you are interested, you can download it and give it a try.
http://downloads.raspberrypi.org/raspios_arm64/images/
Also consider Ubuntu's 64-bit system, which can take full advantage of the 8 GB of RAM.
https://ubuntu.com/download/raspberry-pi
Running this model takes 5.5 GB of memory, which should theoretically be possible on the 8 GB Raspberry Pi with a 64-bit operating system.
TensorFlow Lite
1. Find the I/O layer
SC-FEGAN's input is 5 images: the real image, a monochrome sketch map, a color (stroke) map, a noise map, and a mask map. First visualize them, paying attention to the value range of each image.
def visualImageOutput(self, real, sketch, stroke, noise, mask):
    """
    :param real: real image
    :param sketch: monochrome sketch map
    :param stroke: color (stroke) map
    :param noise: noise map
    :param mask: mask map
    :return:
    """
    temp = (real + 1) * 127.5
    temp = np.asarray(temp[0, :, :, :], dtype=np.uint8)
    cv2.imwrite('real.jpg', temp)

    temp = sketch * 255
    temp = np.asarray(temp[0, :, :, 0], dtype=np.uint8)
    cv2.imwrite('sketch.jpg', temp)

    temp = (stroke + 1) * 127.5
    temp = np.asarray(temp[0, :, :, :], dtype=np.uint8)
    cv2.imwrite('stroke.jpg', temp)

    temp = noise * 255
    temp = np.asarray(temp[0, :, :, 0], dtype=np.uint8)
    cv2.imwrite('noise.jpg', temp)

    temp = mask * 255
    temp = np.asarray(temp[0, :, :, 0], dtype=np.uint8)
    cv2.imwrite('mask.jpg', temp)
The output is an image generated by the GAN, and the corresponding output layer is Tanh.
# Name of the node 0 - real_images
# Name of the node 1 - sketches
# Name of the node 2 - color
# Name of the node 3 - masks
# Name of the node 4 - noises
# Name of the node 1003 - generator/Tanh
Remember the indices and names of these network layers for easy access later.
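One way to list them (a hedged sketch, assuming sess is the tf.Session that has loaded SC-FEGAN.ckpt; the filter on 'Tanh' is only for convenience):
# Enumerate graph nodes to locate the five input placeholders and the generator/Tanh output.
for i, node in enumerate(sess.graph_def.node):
    if i < 5 or node.name.endswith('Tanh'):
        print('Name of the node', i, '-', node.name)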
2. Convert the ckpt to pb
Here, add is an alias for the custom output layer, and saved_model.pb is the required serialized static graph file.
from tensorflow.python.framework import graph_util

constant_graph = graph_util.convert_variables_to_constants(self.sess, self.sess.graph_def, ['add'])

# Write the serialized pb file
with tf.gfile.FastGFile('saved_model.pb', mode='wb') as f:
    f.write(constant_graph.SerializeToString())
    print('save', 'saved_model.pb')
3. Convert the pb to TFLite
Knowing the name of the input/output layer, you can easily get the model_pb.tflite file.
path = 'saved_model.pb'
inputs = ['real_images', 'sketches', 'color', 'masks', 'noises']  # input node names of the model file
outputs = ['add']  # output node name of the model file
converter = tf.contrib.lite.TocoConverter.from_frozen_graph(path, inputs, outputs)
# converter.post_training_quantize = True
tflite_model = converter.convert()
open("model_pb.tflite", "wb").write(tflite_model)
print('tflite convert done.')
You can set post_training_quantize to enable compression, reducing the size of the model by about half.
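For example (a sketch reusing the converter object from the block above; the quantized file name is just an illustrative choice):
import os

converter.post_training_quantize = True        # enable post-training quantization
quantized_model = converter.convert()
open("model_pb_quant.tflite", "wb").write(quantized_model)

# Compare the two file sizes; quantization roughly halves the model.
print(os.path.getsize("model_pb.tflite") / 1e6, "MB vs",
      os.path.getsize("model_pb_quant.tflite") / 1e6, "MB")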
Once the conversion is done on the laptop or server, copy the tflite file to the Raspberry Pi.
4. Activate the TensorFlow Lite environment
deactivate
source ~/my_envs/tf_lite/bin/activate
5. Read the model
# Assuming the lightweight runtime package on the Raspberry Pi
# (with the full TensorFlow package, `from tensorflow import lite as tflite` also works):
import tflite_runtime.interpreter as tflite

# Load TFLite model and allocate tensors.
model_path = "./model_pb.tflite"
self.interpreter = tflite.Interpreter(model_path=model_path)
self.interpreter.allocate_tensors()

# Get input and output tensors.
self.input_details = self.interpreter.get_input_details()
self.output_details = self.interpreter.get_output_details()
6. Model inference
Note the batch data structure here: split the NumPy matrix to match the dimensions of the different input images, then fill in the tensors in the order the network defines them.
real_images, sketches, color, masks, noises, _ = np.split(batch.astype(np.float32), [3, 4, 7, 8, 9], axis=3)

# Fill in the input data
interpreter.set_tensor(input_details[0]['index'], real_images)
interpreter.set_tensor(input_details[1]['index'], sketches)
interpreter.set_tensor(input_details[2]['index'], color)
interpreter.set_tensor(input_details[3]['index'], masks)
interpreter.set_tensor(input_details[4]['index'], noises)

# Run the model
interpreter.invoke()

# Get the output image
result = interpreter.get_tensor(output_details[0]['index'])
Finally, restore the normalized output to image format.
result = (result + 1) * 127.5
result = np.asarray(result[0, :, :, :], dtype=np.uint8)
self.output_img = result
7. Run the program
python3 demo_tflite_rpi.py
Click Open Image to open an image file, then click Mask, paint over the glasses, and click Complete.
Inference on the Raspberry Pi takes about 18 seconds, which is not bad, and requires only about 1 GB of memory.
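If you want to measure this yourself, a simple sketch around the inference call (on Linux, ru_maxrss is reported in kilobytes):
import time
import resource

start = time.time()
interpreter.invoke()
print('inference time: %.1f s' % (time.time() - start))
print('peak memory: %.0f MB' % (resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024))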
Tip:
Note that the path of the opened image must not contain Chinese characters, otherwise the image object cannot be obtained.
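A possible workaround (not part of the original demo, just a sketch): read the file as raw bytes and decode it, which sidesteps cv2.imread's problem with non-ASCII paths.
import numpy as np
import cv2

def imread_any_path(path):
    # np.fromfile goes through Python's own file handling, so non-ASCII paths are fine;
    # cv2.imdecode then decodes the raw bytes into a BGR image (returns None on failure).
    data = np.fromfile(path, dtype=np.uint8)
    return cv2.imdecode(data, cv2.IMREAD_COLOR)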
Android Application Deployment
Now that the model has been converted to TensorFlow Lite, you can easily deploy it to a mobile phone by following our previous tutorial.
1. Configure dependencies
Configure the TensorFlow Lite dependencies in build.gradle:
def tfl_version = "0.0.0-nightly"
implementation("org.tensorflow:tensorflow-lite:${tfl_version}") { changing = true }
implementation("org.tensorflow:tensorflow-lite-gpu:${tfl_version}") { changing = true }
2. Read the model
modelName is set to model_pb.tflite, and the getInterpreter function is used to load the network model.
@Throws(IOException::class)
private fun getInterpreter(
context: Context,
modelName: String,
useGpu: Boolean = false
): Interpreter {
val tfliteOptions = Interpreter.Options()
tfliteOptions.setNumThreads(numberThreads)
gpuDelegate = null
if (useGpu) {
gpuDelegate = GpuDelegate()
tfliteOptions.addDelegate(gpuDelegate)
}
tfliteOptions.setNumThreads(numberThreads)
return Interpreter(loadModelFile(context, modelName), tfliteOptions)
}
3. Model inference
The core code defines the array of input images and the output container:
// inputRealImage 1 x 512 x 512 x 3 x 4
// inputSketches 1 x 512 x 512 x 1 x 4
// inputStroke 1 x 512 x 512 x 3 x 4
// inputMask 1 x 512 x 512 x 1 x 4
// inputNoises 1 x 512 x 512 x 1 x 4
val inputs = arrayOf<Any>(inputRealImage, inputSketches, inputStroke, inputMask, inputNoises)
val outputs = HashMap<Int, Any>()
val outputImage =
Array(1) { Array(CONTENT_IMAGE_SIZE) { Array(
CONTENT_IMAGE_SIZE
) { FloatArray(3) } } }
outputs[0] = outputImage
Log.i(TAG, "init image"+inputRealImage)
styleTransferTime = SystemClock.uptimeMillis()
interpreterTransform.runForMultipleInputsOutputs(
inputs,
outputs
)
styleTransferTime = SystemClock.uptimeMillis() - styleTransferTime
Log.d(TAG, "Style apply Time to run: $styleTransferTime")
4. Package the app
Glasses can’t hide your beauty now.
Perfect!
Well, from automatically putting on a mask to automatically taking it off, this kind of paired AI application is not only interesting; the two applications can also augment each other's data and improve each other's models.
Download the source code
The relevant files for this article can be downloaded from the official account "Deep Awakening" by replying "RPI13" in the background.
Next up
In the next article, we will introduce another pair of applications, in the spirit of Zhou Botong's magic skills. Let's learn together! Stay tuned...