The idea of a so-called “master face” — a set of fake images generated by machine learning algorithms that impersonate people to hack facial biometrics — made compelling headlines overseas last week. But a closer look at the study reveals obvious weaknesses, and it’s unlikely to work in the real world.
“A Master Face is an image of a human face that passes facial-based authentication for the majority of the population,” explains the paper, which was posted on arXiv earlier this month. “These faces can be used to impersonate any user, with a high probability of success, without accessing any user information.”
The Tel Aviv University academics went on to say that they built a model that generated nine master faces capable of representing 40 percent of the population, bypassing “three leading deep face recognition systems.” At first glance this seems impressive, and, if accurate, the claims would pose a clear security risk to applications that rely on facial recognition.
First, the team used Nvidia’s StyleGAN system to create realistic images of fake faces. Each fake output was compared against real images of the 5,749 people represented in the Labeled Faces in the Wild (LFW) dataset, with separate classifier algorithms scoring how similar each AI-generated fake was to the real faces.
Fakes with high classifier similarity scores were retained and the rest discarded. The scores then guided an evolutionary algorithm that steered StyleGAN toward generating more and more convincing spoof faces resembling people in the dataset.
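The pipeline above can be sketched with toy stand-ins. Nothing below is the authors’ code: `generator` stands in for StyleGAN, `similarity` for a face matcher’s score, and a simple (1+1) evolutionary strategy searches the latent space for a candidate that passes verification against as many enrolled identities as possible.

```python
import random

DIM = 8  # toy latent dimensionality (StyleGAN's real latent space is much larger)

def generator(latent):
    # Stand-in for StyleGAN: maps a latent vector to a face "embedding".
    return latent

def similarity(a, b):
    # Stand-in for a face matcher: negative squared distance.
    return -sum((x - y) ** 2 for x, y in zip(a, b))

def coverage(candidate, population, threshold=-1.0):
    # Fraction of the enrolled population the candidate passes as.
    face = generator(candidate)
    hits = sum(similarity(face, p) > threshold for p in population)
    return hits / len(population)

def evolve(population, generations=300, sigma=0.3, seed=0):
    # (1+1) evolutionary strategy: mutate the latent vector and keep
    # the mutation whenever it covers at least as many identities.
    rng = random.Random(seed)
    best = [rng.gauss(0, 1) for _ in range(DIM)]
    best_score = coverage(best, population)
    for _ in range(generations):
        cand = [x + rng.gauss(0, sigma) for x in best]
        score = coverage(cand, population)
        if score >= best_score:
            best, best_score = cand, score
    return best, best_score

rng = random.Random(42)
people = [[rng.gauss(0, 0.4) for _ in range(DIM)] for _ in range(100)]
master, score = evolve(people)
```

The real system optimizes against trained face-recognition classifiers rather than a distance threshold, but the loop structure — generate, score, mutate, keep the better candidate — is the same.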
Over time, the researchers converged on a small set of master faces that matched as many identities as possible in the dataset. In short, just nine images could represent 40 percent of the 5,749 people in the Labeled Faces in the Wild dataset.
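Choosing a small set of faces that jointly covers many identities is, in effect, a set cover problem. This illustrative sketch (not the paper’s algorithm) applies the classic greedy approximation to hypothetical, randomly generated coverage sets:

```python
import random

random.seed(1)
N_PEOPLE, N_CANDIDATES = 200, 50

# Hypothetical precomputed result: cover[c] is the set of people that
# candidate fake face c passes verification for (random sets here).
cover = [set(random.sample(range(N_PEOPLE), random.randint(5, 25)))
         for _ in range(N_CANDIDATES)]

def greedy_master_set(cover, k=9):
    # Repeatedly pick the candidate covering the most not-yet-covered
    # people - the standard greedy approximation to set cover.
    chosen, covered = [], set()
    for _ in range(k):
        best = max(range(len(cover)), key=lambda c: len(cover[c] - covered))
        chosen.append(best)
        covered |= cover[best]
    return chosen, len(covered) / N_PEOPLE

chosen, frac = greedy_master_set(cover)
```

With real coverage sets derived from a face matcher, `frac` is the headline number: the share of the enrolled population that some face in the chosen set can impersonate.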
Next, they used these master faces to try to fool three different facial recognition models: Dlib, FaceNet, and SphereFace. These systems ranked highest in a benchmark competition for face-matching algorithms evaluated on the LFW dataset.
However, a quick look at the master faces that scored highest against each of the three models reveals a significant limitation of the study: they are almost all fake pictures of older white men with white hair, glasses, and beards. If images of a single type can represent a large share of the LFW dataset, the dataset itself must be flawed.
A disclaimer posted on the website hosting the dataset confirms this: “Many groups are not well represented in LFW. For example, there are very few children, no babies, very few people over 80 and a relatively small proportion of women. In addition, many ethnic groups have very low or no representation at all.”
The scores of the nine master faces reflect the limitations of the LFW dataset: women, people with darker skin, and younger faces scored lower and were less likely to fool the three models tested.
“While LFW can theoretically be used to assess the performance of certain subgroups, the database is designed so that there is not enough data to draw strong statistical conclusions about subgroups. Simply put, LFW is not large enough to demonstrate that a particular piece of software has been thoroughly tested,” according to another disclaimer on LFW’s website.
While the idea of a master face that can impersonate many people and unlock facial recognition systems is intriguing, the study is just another case of training and testing machine learning models on flawed data. Garbage in, garbage out, as they say.
The LFW dataset lacks diversity, so a computer-generated master face is more likely to cover a large percentage of that dataset. These images are unlikely to work in the real world.
“LFW does suffer from the limitations described on its official website, but despite these limitations, LFW remains a widely used dataset in academic literature for evaluating face recognition methods,” Tomer Friedlander, co-author of the paper and a researcher at Tel Aviv University’s School of Electrical Engineering, told The Register.
“Our paper raises potential vulnerabilities in facial recognition systems that could be exploited by attackers. Therefore, developers and users of facial recognition methods should take this into account. We haven’t tested our approach against commercial face recognition systems used in real life, so we can’t reference real life systems.”
He says it is possible to retrain the models on more diverse datasets to try to fool real-world systems. “We are interested in further exploring the possibility of using the master faces generated by our method to help protect existing facial recognition systems from such attacks. We’re saving that for future research.”
Don’t be fooled by the lurid headlines claiming that these master faces can break into “over 40% of facial ID authentication systems” or that they are “very successful.” There is little evidence to support these claims.
About face recognition technology
Face recognition aims to extract distinguishing features from an image of a human face and use them to identify the person.
- Feature-based face detection
Face detection is carried out using color, contour, texture, structure, or histogram features.
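As an illustration of the color-feature approach, here is a widely cited RGB skin-color heuristic — a rough rule of thumb, not a production detector — applied to a toy image:

```python
def is_skin(r, g, b):
    # A commonly cited RGB skin-color rule (illustrative only):
    # bright, red-dominant pixels with sufficient channel spread.
    return (r > 95 and g > 40 and b > 20 and
            max(r, g, b) - min(r, g, b) > 15 and
            abs(r - g) > 15 and r > g and r > b)

def skin_fraction(image):
    # image: list of rows of (r, g, b) tuples; returns the share
    # of pixels classified as skin-colored.
    pixels = [p for row in image for p in row]
    return sum(is_skin(*p) for p in pixels) / len(pixels)

# A toy 2x2 "image": two skin-toned pixels, sky blue, grass green.
img = [[(220, 170, 140), (200, 150, 120)],
       [(60, 120, 200), (40, 160, 60)]]
frac = skin_fraction(img)  # 0.5: half the pixels are skin-colored
```

A real detector would combine such pixel rules with contour or texture cues to reject skin-colored regions that are not faces.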
- Template-matching-based face detection
A face template is extracted from a database, and a matching strategy compares the captured face image against templates from the library; the degree of correlation and the size of the matched template determine the face’s location and size.
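A minimal sketch of the matching step, using sum of squared differences over grayscale values (real systems typically use normalized cross-correlation and search across multiple template scales):

```python
def ssd(patch, template):
    # Sum of squared differences between a patch and the template.
    return sum((patch[i][j] - template[i][j]) ** 2
               for i in range(len(template))
               for j in range(len(template[0])))

def match_template(image, template):
    # Slide the template over the image; return the top-left corner
    # of the best-matching (lowest-SSD) window.
    th, tw = len(template), len(template[0])
    best, best_pos = None, None
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            d = ssd(patch, template)
            if best is None or d < best:
                best, best_pos = d, (y, x)
    return best_pos

image = [[0, 0, 0, 0],
         [0, 9, 8, 0],
         [0, 7, 9, 0],
         [0, 0, 0, 0]]
template = [[9, 8],
            [7, 9]]
pos = match_template(image, template)  # exact match at row 1, col 1
```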
- Statistics-based face detection
By collecting large numbers of positive and negative samples (“face” and “non-face” images) and training the system with statistical methods, face and non-face patterns can be detected and classified.
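The statistical approach can be sketched with the simplest possible learner: a perceptron trained on hypothetical 2-D feature vectors for “face” and “non-face” samples (real detectors use far richer features and models):

```python
import random

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    # Classic perceptron: nudge the weights on every misclassified sample.
    dim = len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):  # y is +1 (face) or -1 (non-face)
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Hypothetical 2-D "features": faces cluster around (1, 1),
# non-faces around (-1, -1).
rng = random.Random(0)
faces = [(1 + rng.gauss(0, 0.2), 1 + rng.gauss(0, 0.2)) for _ in range(50)]
others = [(-1 + rng.gauss(0, 0.2), -1 + rng.gauss(0, 0.2)) for _ in range(50)]
X = faces + others
y = [1] * 50 + [-1] * 50
w, b = train_perceptron(X, y)
acc = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(X)
```

On cleanly separable toy clusters like these the perceptron converges; the next paragraph explains why real face data is far harder.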
Because face images follow an irregular manifold distribution in high-dimensional space, any obtainable sample covers only a tiny fraction of the face image space, and statistical learning from such small samples remains an open research problem. It is worth noting that no algorithm achieves 100 percent recognition accuracy: noise, measurement error, the choice of algorithm and training set, image background, and subject movement can all cause misidentification, leading to inaccurate face detection in video.
Face recognition applications
At present, face recognition technology in China is applied mainly in three areas: attendance and access control, security, and finance. Specific uses include security monitoring, face detection in video, face recognition, and visitor counting; these are widely deployed for community and building access control, detecting suspicious loitering along perimeters, and counting visitors at scenic spots.