Follow the WeChat official account "AI Front" (ID: AI-front)
Imagine a future where, during a remote call, you can feel the presence of the person in front of you through a hologram. A similar application based on computer vision has now been created.
Over the past decade, computer vision researchers have made exciting advances in 3D face reconstruction and face alignment, the most important being the application of convolutional neural networks. However, because the 3D face models used for mapping are imperfect, the reconstruction space is limited, and the results of most 3D face reconstruction methods are therefore unsatisfactory.
Above is the structure of PRN, with green rectangles representing residual blocks and blue rectangles representing transposed convolution layers.
In a recent paper, Yao Feng's team proposed an end-to-end approach, the Position Map Regression Network (PRN), which jointly performs dense face alignment and 3D face shape reconstruction. For both 3D face alignment and reconstruction, this method far outperforms previous methods on multiple databases.
More specifically, they designed a UV position map (https://en.wikipedia.org/wiki/UV_mapping): a 2D image that records the 3D coordinates of the complete facial point cloud while preserving the semantic meaning of each position in UV space. A simple encoder-decoder network is then trained with a weighted loss to regress the UV position map from a single 2D face image.
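To make the representation concrete, here is a minimal numpy sketch of how a UV position map can be built: each vertex of the facial point cloud has a fixed (u, v) coordinate, and its 3D position (x, y, z) is written into the three channels of the image at that location. This is an illustrative nearest-pixel rasterization, not the authors' implementation (which renders the full mesh); the function name and signature are hypothetical.

```python
import numpy as np

def make_uv_position_map(vertices, uv_coords, size=256):
    """Write 3D vertex positions into a UV position map (nearest-pixel sketch).

    vertices:  (N, 3) array of 3D face points
    uv_coords: (N, 2) per-vertex UV coordinates in [0, 1]
    Returns a (size, size, 3) image whose channels store x, y, z.
    """
    pos_map = np.zeros((size, size, 3), dtype=np.float32)
    # Map continuous UV coordinates to integer pixel indices.
    cols = np.clip((uv_coords[:, 0] * (size - 1)).round().astype(int), 0, size - 1)
    rows = np.clip((uv_coords[:, 1] * (size - 1)).round().astype(int), 0, size - 1)
    pos_map[rows, cols] = vertices
    return pos_map
```

Because every pixel of the map has a fixed semantic meaning (the same UV location always corresponds to the same facial region), regressing this image simultaneously yields the dense 3D shape and the alignment of every point.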
The image above shows some of this method's results: odd rows show the face alignment results (only 68 key points are plotted), and even rows show the 3D reconstruction results.
The contributions of this paper are mainly in the following aspects:
- It is the first end-to-end approach that solves face alignment and 3D face reconstruction together, without being restricted to a low-dimensional solution space.
- To obtain the 3D face structure and dense alignment directly, the authors developed a representation called the UV position map, which records the 3D positions of the face and provides the semantic meaning of each point in UV space.
- For the training phase, they propose a weight mask that assigns a different weight to each point on the position map when computing the loss. Experiments show that this design improves the network's performance.
- They provide a lightweight framework that runs at 100 fps and reconstructs a 3D face directly from a single 2D face image.
- On the AFLW2000-3D and Florence databases, this method achieves a relative 25% improvement over the current best methods on both 3D face reconstruction and dense face alignment.
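The weight-mask idea above can be sketched in a few lines of numpy: the loss is an ordinary mean squared error between predicted and ground-truth position maps, except that each pixel is scaled by a per-pixel weight (higher on discriminative regions such as the 68 landmarks, lower or zero elsewhere). This is an illustrative sketch, not the paper's exact formulation or weight ratios.

```python
import numpy as np

def weighted_position_loss(pred, gt, weight_mask):
    """Weighted MSE between predicted and ground-truth UV position maps.

    pred, gt:    (H, W, 3) position maps
    weight_mask: (H, W) per-pixel weights, e.g. largest at landmark
                 positions and zero on regions to be ignored
    """
    return float(np.mean(weight_mask[..., None] * (pred - gt) ** 2))
```

With a uniform mask this reduces to plain MSE; increasing the weight on key regions pushes the network to fit those points more accurately.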
The code for this method is implemented with TensorFlow's Python interface; the project's official repository is https://github.com/YadiraF/PRNet. If you want to test the face reconstruction yourself, you need to install the following environment:
- Python 2.7 (with the numpy, skimage, and scipy libraries)
- TensorFlow, newer than version 1.4
- dlib (for face detection; not needed if you can provide a face bounding box yourself)
- OpenCV2 (for displaying test results)
The trained model can be downloaded from Baidu Cloud (https://pan.baidu.com/s/10vuV7m00OHLcsihaC-Adsw) or Google Drive (https://drive.google.com/file/d/1UoE-XuW1SDLUjZmJPkIZ1MLxvQFgmTFH/view?usp=sharing). The code is still under development, and the team will continue to improve it and provide more flexible features in the future.
- Face alignment: dense alignment (68 points) of both visible and occluded facial key points.
- 3D face reconstruction: obtains the key points of the 3D face model and their corresponding colors from a single image. The result can be saved as mesh data with the .obj suffix, which can be loaded directly in MeshLab or Microsoft 3D Builder. Naturally, the texture of self-occluded regions will be distorted.
- 3D pose estimation: instead of using only 68 key points for face pose estimation, more accurate pose predictions can be obtained by using all the key points of the 3D model (over 40,000).
- Depth image:
- Texture editing: can be used for data augmentation; for an input face image, the texture of a specific region can be changed, such as the eyes:
- Face swapping: replaces the face in a given image with another person's face, adapting it to the pose of the face in that image.
- Clone the project
- Download the trained PRN model from Baidu Cloud or Google Drive and place it under the path Data/net-data
- Run the test code
- Test with your own images
Run python demo.py --help to get more help.
Original paper:
https://arxiv.org/pdf/1803.07835.pdf
Original English article:
3D Face Reconstruction with Position Map Regression Networks
https://heartbeat.fritz.ai/3d-face-reconstruction-with-position-map-regression-networks-