Hello everyone, I'm Early Winter. I've always wanted to study artificial intelligence, but I kept watching from the sidelines. Last time I wrote “A Front-end Image Processing Secrets”, so I decided to strike while the iron was hot: full of energy, I had the idea of front-end face changing and wrote a face-changing demo, which produced the demo pictures below.
Today, we'll use this demo to see how TFJS + Canvas can implement front-end face changing.
Technical analysis
The key to front-end face changing is knowing where to operate, and right now only AI can play that role, because everything hinges on recognizing the face and calculating the size and position of the facial features. That's what lets us meet our powerful, unconstrained needs, so how do we make it as simple as a snap of the fingers?
First of all, we need to accurately determine whether an image (picture or video) contains a face, and then obtain the face's boundary and the positions of the facial features. Detecting these facial feature points is the job we hand over to the “neural network”.
Secondly, as the chief surgeon, once we get the report on the positions of the facial features, we communicate with the client to clarify their needs and analyze carefully once more. Only then can we confidently and boldly pick up the “knife”.
Don't forget the anaesthetic. I forgot it several times; the clients passed out from the pain, and I saved the cost of the anaesthetic. I am such a thrifty doctor.
In addition to the positions of the facial features, we also need to know their sizes. To be honest, that is a bit difficult: not everyone's eyes are as easy to measure precisely as Du Haitao's and Li Ronghao's (measurable under a microscope), and not everyone's mouth can swallow mountains and rivers like Yao Chen's and Shu Qi's.
I can feel that some readers are already getting anxious. No need to worry, friends; after all, it's not you going under the knife 🤪. There are always more methods than difficulties, so let's take it one step at a time.
Technology selection
To obtain the positions of the facial features, we first have to recognize the face, and then get the range and location of each feature. Here we have to borrow the power of AI. My preliminary research into intelligent face detection libraries turned up face-api.js, tracking.js, clmtrackr.js, and TFJS.
With the full confidence and expectation of a junior surgeon, I started with face-api.js and tried them one by one, failing again and again. I almost gave up at the very start. Could a miracle doctor of my generation decline just like that? Rising from the ruins of self-doubt, I reviewed my preparation and workflow and found that the repository hadn't been updated in three years. Maybe the fault was mine, but I still resolutely abandoned face-api.js, tracking.js, and clmtrackr.js, and placed my trust in TFJS.
Soon I found some pre-trained models on the official TFJS website, including the face detection model we needed. Coincidence? No, it opened the door of cosmetic surgery for us, and Google did not disappoint me 😄.
After a long and enjoyable round of testing, I found that the Blazeface library can help us detect:
- The start and end coordinates of the face range
- The position of the left and right eyes
- The position of the left and right ears
- The position of the nose
- The position of the mouth
A picture is worth a thousand words. Seeing is believing.
Technical implementation
Only talking theory would be too dry, and that's not my style. As for the technical implementation, let's start with a simple demo: adding 💗 over the eyes.
I’d like to start with a sketch.
The whole implementation process can be divided into the following steps:
- Introduce TFJS and load the AI model (face recognition)
- Get information about all the faces in the image
- Calculate the size of each person’s eyes
- Canvas draws the image and adds 💗 above the eye
Here we draw an overlay instead of modifying the image's ImageData. Of course, we could change the ImageData directly, but, as Luo Xiang would say, that is not recommended. Modifying the ImageData directly is computationally heavy, which makes drawing lag and can even freeze the browser. Besides, the model cannot guarantee its accuracy yet, so direct pixel edits could distort the image. After all, we currently rely on the coordinates the AI model gives us, and we can only expect the model to become more complete and accurate in the future.
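To make the difference concrete, here is a rough sketch of the two approaches. This is illustrative only (it assumes a canvas already on the page with the image drawn onto it), not code from the demo:
const canvas = document.querySelector('canvas');
const ctx = canvas.getContext('2d');

// Approach 1: mutate the ImageData. Every pixel is 4 bytes (RGBA), so a
// 1920x1080 frame means roughly 8 million array reads/writes per pass.
const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
for (let i = 0; i < frame.data.length; i += 4) {
  // per-pixel work would go here
}
ctx.putImageData(frame, 0, 0);

// Approach 2: an overlay is a single draw call on top of the existing image,
// and a slightly-off coordinate just misplaces a sticker instead of
// corrupting pixels.
ctx.font = '48px serif';
ctx.fillText('💗', 100, 100);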
Step 1: Import the Blazeface detector library and load the AI model
The Blazeface detector library depends on TFJS, so we need to load TFJS first.
There are two ways to import it.
NPM import
// npm install @tensorflow/tfjs @tensorflow-models/blazeface
const blazeface = require('@tensorflow-models/blazeface');
The script tag
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/blazeface"></script>
Loading the AI model
⚠️ Readers in mainland China will need a VPN, because the model has to be loaded from TFHub. Hopefully we will be able to choose where the model is loaded from in the future (the proposal has already been accepted).
async function main() {
  // Load the Blazeface model.
  const model = await blazeface.load();
  // TODO...
}
main();
Step 2: Get information about all the faces in the image
Once the model has loaded, we can use the estimateFaces method to detect all the faces in an image. It returns an array whose length is the number of detected faces; each element is an object describing one face with the following fields:
- `topLeft`: the coordinates of the ↖️ corner of the face boundary
- `bottomRight`: the coordinates of the ↘️ corner of the face boundary; combined with `topLeft`, it gives the width and height of the face
- `probability`: the detection confidence
- `landmarks`: an array of facial feature positions, in order: right eye, left eye, nose, mouth, right ear, left ear
The estimateFaces method takes two parameters:
- `input`: a DOM node or object; it can be a `video` or an `image`
- `returnTensors`: a boolean controlling the return type; if `false`, it returns concrete values (x and y coordinates, etc.), and if `true`, it returns tensor objects
Example:
// We pass the video or image object (or node) to the model
const input = document.querySelector("img");
// This method returns an array containing each face's bounding box and facial feature coordinates; each element corresponds to one face.
const predictions = await model.estimateFaces(input, false);
/*
  `predictions` is an array of objects describing each detected face, for example:
  [
    {
      topLeft: [232.28, 145.26],
      bottomRight: [449.75, 308.36],
      probability: [0.998],
      landmarks: [
        [295.13, 177.64], // right eye coordinates
        [382.32, 175.56], // left eye coordinates
        [341.18, 205.03], // nose coordinates
        [345.12, 250.61], // mouth coordinates
        [252.76, 211.37], // right ear coordinates
        [431.20, 204.93]  // left ear coordinates
      ]
    }
  ]
*/
Step 3: Calculate the size of each person’s eyes
In step 2 we obtained the coordinates of each person's eyes. Next, we need to estimate the size of the eyes. If you look carefully, you may have noticed that the data the model returns has no eye-size attribute, so how do we judge it? In figure 3 above, the eye coordinates sit roughly on the lower eyelid, the nose coordinate marks the nose, and the mouth coordinate is its center point, all with some deviation. Observing carefully, the angle of the face also affects the apparent eye size, but one pattern holds: measuring from the top boundary of the face down to the eye coordinate, about half of that height of offset lands around the eye. So we approximate the eye size as the y coordinate of the eye minus the y coordinate of the top boundary.
for (let i = 0; i < predictions.length; i++) {
  const start = predictions[i].topLeft;
  const end = predictions[i].bottomRight;
  // Width and height of the face bounding box
  const size = [end[0] - start[0], end[1] - start[1]];
  const rightEyeP = predictions[i].landmarks[0];
  const leftEyeP = predictions[i].landmarks[1];
  // Eye size: the distance from the top boundary of the face down to the eye
  const fontSize = rightEyeP[1] - start[1];
  context.font = `${fontSize}px/${fontSize}px serif`;
}
Step 4: Draw the image on canvas and add 💗
Since we are using an overlay, we first draw the original image, and then draw our 💗 at the eye positions with `CanvasRenderingContext2D.fillText()`. We could also overlay images instead; pick whichever you prefer. I think text is faster, because images need to load 😛.
// Omit the step of drawing the original image here, see the source code for details
// ...
// Walk through the array of face information
for (let i = 0; i < predictions.length; i++) {
  const start = predictions[i].topLeft;
  const end = predictions[i].bottomRight;
  const size = [end[0] - start[0], end[1] - start[1]];
  const rightEyeP = predictions[i].landmarks[0];
  const leftEyeP = predictions[i].landmarks[1];
  // Draw a heart over each eye
  const fontSize = rightEyeP[1] - start[1];
  context.font = `${fontSize}px/${fontSize}px serif`;
  context.fillStyle = 'red';
  context.fillText('❤️', rightEyeP[0] - fontSize / 2, rightEyeP[1]);
  context.fillText('❤️', leftEyeP[0] - fontSize / 2, leftEyeP[1]);
}
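For reference, here is how steps 1 to 4 fit together on a single page. This is a minimal sketch, with placeholder element ids and image path, not the exact demo source:
<img id="face" src="./assets/images/face.jpg" />
<canvas id="canvas"></canvas>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/blazeface"></script>
<script>
  async function main() {
    const input = document.getElementById('face');
    const canvas = document.getElementById('canvas');
    const context = canvas.getContext('2d');

    // Step 1: load the model.
    const model = await blazeface.load();
    // Step 2: detect all the faces in the image.
    const predictions = await model.estimateFaces(input, false);

    // Step 4: draw the original image first, then overlay the hearts.
    canvas.width = input.naturalWidth;
    canvas.height = input.naturalHeight;
    context.drawImage(input, 0, 0);
    for (const face of predictions) {
      const [rightEyeP, leftEyeP] = face.landmarks;
      // Step 3: eye size = eye y minus the top boundary y.
      const fontSize = rightEyeP[1] - face.topLeft[1];
      context.font = `${fontSize}px serif`;
      context.fillText('💗', rightEyeP[0] - fontSize / 2, rightEyeP[1]);
      context.fillText('💗', leftEyeP[0] - fontSize / 2, leftEyeP[1]);
    }
  }
  // Wait for the image to load before reading its natural size.
  window.onload = main;
</script>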
The source code is served up. Can't wait to try it? Don't worry, I've also prepared some other interesting demos below, so don't forget to like and bookmark.
Advanced
This article only covers some introductory image processing techniques. Advanced image processing is far more complex than we imagine and requires dedicated algorithms; interested readers can look up the relevant documentation. For example, there are image tracking algorithms, image processing algorithms, binarization, 256-color and grayscale conversion, and so on. In addition, I'd like to help you consolidate what we've learned by sharing a few demos and implementing them one by one.
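As a small taste of that advanced side, here is a sketch of the binarization and grayscale conversion just mentioned, using the per-pixel ImageData approach from earlier. It assumes a context that already has an image drawn on it, and the function name and threshold are my own choices:
// Convert to grayscale, then binarize: pixels brighter than the threshold
// become white, the rest black. ctx is a CanvasRenderingContext2D.
function binarize(ctx, width, height, threshold = 128) {
  const frame = ctx.getImageData(0, 0, width, height);
  const data = frame.data; // RGBA bytes, 4 per pixel
  for (let i = 0; i < data.length; i += 4) {
    // Weighted grayscale value of this pixel.
    const gray = 0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2];
    const v = gray >= threshold ? 255 : 0;
    data[i] = data[i + 1] = data[i + 2] = v;
  }
  ctx.putImageData(frame, 0, 0);
}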
Epidemic prevention expert
The epidemic rages on and refuses to leave. In the hot summer, everyone must feel suffocated wearing masks, so let me give you an invisible one. Go find a mask PNG with a transparent background, and then we can start our performance.
Steps:
- Get the mouth position from the model
- Calculate the width of the mouth
- Draw images and masks on canvas
Analysis:
- The mask is an image, so it needs to be drawn with the `CanvasRenderingContext2D.drawImage()` method
- The center of the mask roughly coincides with the center of the mouth
for (let i = 0; i < predictions.length; i++) {
  const start = predictions[i].topLeft;
  const end = predictions[i].bottomRight;
  const size = [end[0] - start[0], end[1] - start[1]];
  const rightEyeP = predictions[i].landmarks[0];
  const leftEyeP = predictions[i].landmarks[1];
  const noseP = predictions[i].landmarks[2];
  const mouseP = predictions[i].landmarks[3];
  const rightEarP = predictions[i].landmarks[4];
  const leftEarP = predictions[i].landmarks[5];
  // Epidemic prevention expert
  const image = new Image();
  image.src = "./assets/images/mouthMask.png";
  image.onload = function () {
    const top = noseP[1] - start[1];
    const left = start[0];
    // The mouth is the center point: half of the mask above, half below
    context.drawImage(image, mouseP[0] - size[0] / 2, mouseP[1] - size[1] / 2, size[0], size[1]);
  };
}
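One detail worth flagging: the snippet above creates a new Image inside the loop, so the mask is loaded once per face. A small refactor (my suggestion, not the demo's source; it assumes the same `predictions` and `context` as above) loads it once and reuses it:
// Load the mask once, then draw it over every detected face.
const mask = new Image();
mask.src = "./assets/images/mouthMask.png";
mask.onload = function () {
  for (let i = 0; i < predictions.length; i++) {
    const start = predictions[i].topLeft;
    const end = predictions[i].bottomRight;
    const size = [end[0] - start[0], end[1] - start[1]];
    const mouseP = predictions[i].landmarks[3];
    context.drawImage(mask, mouseP[0] - size[0] / 2, mouseP[1] - size[1] / 2, size[0], size[1]);
  }
};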
Source code
Flaming lips
Steps:
- Get the mouth position from the model
- Calculate the width of the mouth
- Draw images on canvas and 👄
Analysis: Again, the model does not return the mouth size, and everyone's mouth is different, so we have to find a breakthrough:
- The nose and the mouth are probably on the same vertical axis
- The width between the ears cannot serve as the height of the mouth
- The width between the eyes? Could that match the mouth?
[A flash of insight] Yes, the width between my eyes seems to be about the same as the width of my mouth. I turned my curious gaze to my colleagues, took an unopened can of Kobick potato chips out of the cabinet on my left, and observed in its reflection: as long as the mouth isn't open, it is about as wide as the distance between the eyes. There are a few exceptions, but I'm pretty sure it's close to the golden ratio.
As a result, we have the following code:
for (let i = 0; i < predictions.length; i++) {
  const start = predictions[i].topLeft;
  const end = predictions[i].bottomRight;
  const size = [end[0] - start[0], end[1] - start[1]];
  const rightEyeP = predictions[i].landmarks[0];
  const leftEyeP = predictions[i].landmarks[1];
  // The mouth position (missing in the original snippet, but used below)
  const mouseP = predictions[i].landmarks[3];
  // Add a flaming lip: the mouth is about as wide as the distance between the eyes
  const fontSize = Math.abs(rightEyeP[0] - leftEyeP[0]);
  context.font = `${fontSize}px/${fontSize}px serif`;
  context.fillStyle = 'red';
  context.fillText('👄', mouseP[0] - fontSize / 2, mouseP[1] + fontSize / 2);
}
Source code
Video processing
In fact, video is processed the same way. The difference is that a picture only needs to be drawn once, while for video we need to draw every frame (as a picture) and then post-process it.
Analysis:
- The size of the canvas equals the size of the video, and the video's size can only be obtained after `onload`
- We need to process every frame of the video, which is a loop, so we might reach for `setTimeout`, `setInterval`, or `requestAnimationFrame`. Obviously `setInterval` is not suitable, and with a large amount of work `setTimeout` may hurt responsiveness, so we choose `requestAnimationFrame` for the loop.
Steps:
- Load the video
- Once the video has loaded successfully, initialize the canvas
- After initialization, load the AI model
- Process every frame of the video (a minimal sketch follows below)
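Here is a minimal sketch of those steps, reusing the hearts effect from earlier. The element ids are assumptions; see the source code for the real implementation:
const video = document.getElementById('video');
const canvas = document.getElementById('canvas');
const context = canvas.getContext('2d');

video.onloadedmetadata = async function () {
  // The canvas size equals the video size.
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const model = await blazeface.load();

  async function drawFrame() {
    // Draw the current frame first, then post-process it.
    context.drawImage(video, 0, 0, canvas.width, canvas.height);
    const predictions = await model.estimateFaces(canvas, false);
    for (const face of predictions) {
      const [rightEyeP, leftEyeP] = face.landmarks;
      const fontSize = rightEyeP[1] - face.topLeft[1];
      context.font = `${fontSize}px serif`;
      context.fillText('💗', rightEyeP[0] - fontSize / 2, rightEyeP[1]);
      context.fillText('💗', leftEyeP[0] - fontSize / 2, leftEyeP[1]);
    }
    // Loop on every repaint.
    requestAnimationFrame(drawFrame);
  }
  requestAnimationFrame(drawFrame);
};
video.play(); // may require a user gesture under autoplay policies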
Source code
Summary of technical knowledge
TensorFlow.js (TFJS for short) is an open-source machine learning library for JavaScript developed by Google.
We used the Blazeface detector, a face detection model provided by TFJS.
The Blazeface detector provides two methods, both of which return a Promise:
- `load`: fetches the latest model from TensorFlow Hub, so it needs unrestricted Internet access; if you don't have that, check whether the open-source community provides a mirror address, and map to it by modifying your local hosts file
- `estimateFaces`: detects all the face information in an image and returns an array; each element represents one face, including the face boundary and the positions of the facial features
In addition, we also used some canvas basics: `CanvasRenderingContext2D.fillText()` and `CanvasRenderingContext2D.drawImage()`.
Homework after class
My friends, how about adding a pair of 🕶️ for yourself?
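If you want a nudge before starting, here is one possible opening move: a fragment meant for the same predictions loop as the demos above, with the scale factor and offsets being my own guesses, not a model answer:
const rightEyeP = predictions[i].landmarks[0];
const leftEyeP = predictions[i].landmarks[1];
// The glasses should span both eyes, so scale off the eye distance.
const fontSize = Math.abs(rightEyeP[0] - leftEyeP[0]) * 2;
const centerX = (rightEyeP[0] + leftEyeP[0]) / 2;
const centerY = (rightEyeP[1] + leftEyeP[1]) / 2;
context.font = `${fontSize}px serif`;
context.fillText('🕶️', centerX - fontSize / 2, centerY + fontSize / 3);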
You can send your work to my email or open a pull request on GitHub. If it passes review, it will be included in my demos.
If you find my article interesting, please leave a comment and let me know your thoughts and opinions.
Problem collection
- The demo has no effect?
- ⚠️ You need unrestricted Internet access, because the model is loaded from TFHub; hopefully we will be able to choose where the model is loaded from in the future (the proposal has already been accepted).
- Use nginx, Python, or node-server to provide a local service and avoid cross-origin resource issues (some resources need to be fetched locally).
- If you want to test detect-video or the camera demo, first check whether you have a local camera. If not, use the MP4 provided by the project for testing.
- Other problems
- Waiting for you to mention
Links
- A front-end image processing secrets
- TensorFlow.js models
- Blazeface detector