Preface

This article explains how the special-effects camera implements the big eyes and thin face effects; the complete source code can be found in AwemeLike. To achieve thin face and big eyes, we first need to obtain the facial feature points. This project uses the Face++ face recognition library, which returns 106 facial feature points, and the effects are then achieved with a deformation algorithm.

1. Thin face

The face thinning algorithm used in the project is based on the article "Using Shaders in OpenGL to fine-tune faces in real time (thin face, big eyes, etc.)".

vec2 curveWarp(vec2 textureCoord, vec2 originPosition, vec2 targetPosition, float delta) {
    
    vec2 offset = vec2(0.0);
    vec2 result = vec2(0.0);
    vec2 direction = (targetPosition - originPosition);
    
    float radius = distance(vec2(targetPosition.x, targetPosition.y / aspectRatio), vec2(originPosition.x, originPosition.y / aspectRatio));
    float ratio = distance(vec2(textureCoord.x, textureCoord.y / aspectRatio), vec2(originPosition.x, originPosition.y / aspectRatio)) / radius;
    
    ratio = 1.0 - ratio;
    ratio = clamp(ratio, 0.0, 1.0);
    offset = direction * ratio * delta;
    
    result = textureCoord - offset;
    
    return result;
}

textureCoord is the current coordinate to be modified, originPosition is the center coordinate, targetPosition is the target coordinate, and delta controls the intensity of the deformation.

The shader above can be understood as follows: first determine a circle whose center is originPosition and whose radius is the distance between targetPosition and originPosition, then move every pixel inside the circle by an offset along the same direction, where the offset is larger the closer the pixel is to the center of the circle. Finally, the transformed coordinate is returned.

Simplified, the method computes: transformed coordinate = original coordinate - (target coordinate - center coordinate) * deformation strength. In other words, it subtracts an offset from the original coordinate, and (targetPosition - originPosition) determines the direction and maximum magnitude of the move.

The offset term, (target coordinate - center coordinate) * deformation strength, can be expanded as follows:

(targetPosition - originPosition) * (1- |textureCoord - originPosition| / |targetPosition - originPosition| ) * delta
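To make the formula concrete with illustrative numbers: take delta = 0.02 and a coordinate lying halfway between originPosition and the circle boundary. The ratio term is then 1 - 0.5 = 0.5, so the offset is 0.5 * 0.02 = 1% of the (targetPosition - originPosition) vector; at the center the offset peaks at delta times that vector, and on or outside the circle it falls to zero.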

In addition to the algorithm used in the project, two other face deformation algorithms could be used. One is a local adjustment algorithm based on Interactive Image Warping (its principle is described in the linked article); the algorithm used in this project can be regarded as a variant of it, and both can be expressed as transformed coordinate = original coordinate - (target coordinate - center coordinate) * deformation strength, differing only in how the deformation strength is computed. The other is a global point deformation algorithm based on Image Deformation Using Moving Least Squares (again, see the linked article for its principle).

To apply the face thinning algorithm above, we only need to select several pairs of feature points as originPosition and targetPosition, apply them to the two cheeks and the chin, and then control the strength of the thinning through delta.

The 106 feature points obtained from Face++ are distributed according to its landmark diagram; the indices used below refer to that numbering.

Upload the 106 feature points to the pixel shader

 uniform float facePoints[106 * 2];

Set uniform deformation intensity

 uniform float thinFaceDelta;
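On the CPU side these uniforms can be fed through GPUImage's uniform setters. A minimal sketch, assuming the filter inherits from GPUImageFilter and that the 106 points have already been converted to normalized texture coordinates (see 4.4); the 0.02 strength value is purely illustrative:

GLfloat points[106 * 2];  // x0, y0, x1, y1, ... in normalized texture coordinates
// ... fill points from the latest detected face ...
[self setFloatArray:points length:106 * 2 forUniform:@"facePoints"];
[self setFloat:0.02 forUniformName:@"thinFaceDelta"];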

Specify center coordinates and target coordinates, 9 pairs in total

 vec2 thinFace(vec2 currentCoordinate) {
     
     vec2 faceIndexs[9];
     faceIndexs[0] = vec2(3., 44.);
     faceIndexs[1] = vec2(29., 44.);
     faceIndexs[2] = vec2(7., 45.);
     faceIndexs[3] = vec2(25., 45.);
     faceIndexs[4] = vec2(10., 46.);
     faceIndexs[5] = vec2(22., 46.);
     faceIndexs[6] = vec2(14., 49.);
     faceIndexs[7] = vec2(18., 49.);
     faceIndexs[8] = vec2(16., 49.);
     
     for(int i = 0; i < 9; i++)
     {
         int originIndex = int(faceIndexs[i].x);
         int targetIndex = int(faceIndexs[i].y);
         vec2 originPoint = vec2(facePoints[originIndex * 2], facePoints[originIndex * 2 + 1]);
         vec2 targetPoint = vec2(facePoints[targetIndex * 2], facePoints[targetIndex * 2 + 1]);
         currentCoordinate = curveWarp(currentCoordinate, originPoint, targetPoint, thinFaceDelta);
     }
     return currentCoordinate;
 }

2. Big eyes

The big eyes algorithm also comes from the same article, "Using Shaders in OpenGL to fine-tune faces in real time (thin face, big eyes, etc.)".

// Enlarge the eye
vec2 enlargeEye(vec2 textureCoord, vec2 originPosition, float radius, float delta) {
    
    float weight = distance(vec2(textureCoord.x, textureCoord.y / aspectRatio), vec2(originPosition.x, originPosition.y / aspectRatio)) / radius;
    weight = 1.0 - (1.0 - weight * weight) * delta;
    weight = clamp(weight, 0.0, 1.0);
    textureCoord = originPosition + (textureCoord - originPosition) * weight;
    return textureCoord;
}

textureCoord is the current coordinate to be modified, originPosition is the center coordinate, radius is the radius of the circle, and delta controls the intensity of the deformation. Similar to the face thinning algorithm, a circle is determined by originPosition and radius, and only coordinates inside the circle are recalculated while coordinates outside it remain unchanged. The final coordinate is entirely determined by weight: the larger the weight, the less the coordinate changes. When weight is 1, i.e. the coordinate lies on or outside the circle, the coordinate is unchanged. When weight is less than 1, the returned coordinate falls between the original coordinate and the circle center, which means the pixel finally sampled is closer to the center than the original one, producing a magnification effect centered on the eye.
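To make this concrete with an illustrative delta of 0.3: a coordinate at half the radius gives weight = 1.0 - (1.0 - 0.5 * 0.5) * 0.3 = 0.775, so the shader samples a point that is only 77.5% of the original distance from the eye center. Since each fragment now takes its color from a pixel closer to the center, the eye content is stretched outward and appears larger.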

3. Complete Shader code

NSString *const kGPUImageThinFaceFragmentShaderString = SHADER_STRING
(
 
 precision highp float;
 varying highp vec2 textureCoordinate;
 uniform sampler2D inputImageTexture;
 
 uniform int hasFace;
 uniform float facePoints[106 * 2];
 
 uniform highp float aspectRatio;
 uniform float thinFaceDelta;
 uniform float bigEyeDelta;
 
 // Enlarge the eye
 vec2 enlargeEye(vec2 textureCoord, vec2 originPosition, float radius, float delta) {
     
     float weight = distance(vec2(textureCoord.x, textureCoord.y / aspectRatio), vec2(originPosition.x, originPosition.y / aspectRatio)) / radius;
     weight = 1.0 - (1.0 - weight * weight) * delta;
     weight = clamp(weight, 0.0, 1.0);
     textureCoord = originPosition + (textureCoord - originPosition) * weight;
     return textureCoord;
 }
 
 vec2 curveWarp(vec2 textureCoord, vec2 originPosition, vec2 targetPosition, float delta) {
    
    vec2 offset = vec2(0.0);
    vec2 result = vec2(0.0);
    vec2 direction = (targetPosition - originPosition) * delta;
    
    float radius = distance(vec2(targetPosition.x, targetPosition.y / aspectRatio), vec2(originPosition.x, originPosition.y / aspectRatio));
    float ratio = distance(vec2(textureCoord.x, textureCoord.y / aspectRatio), vec2(originPosition.x, originPosition.y / aspectRatio)) / radius;
    
    ratio = 1.0 - ratio;
    ratio = clamp(ratio, 0.0, 1.0);
    offset = direction * ratio;
    
    result = textureCoord - offset;
    
    return result;
}
 
 vec2 thinFace(vec2 currentCoordinate) {
     
     vec2 faceIndexs[9];
     faceIndexs[0] = vec2(3., 44.);
     faceIndexs[1] = vec2(29., 44.);
     faceIndexs[2] = vec2(7., 45.);
     faceIndexs[3] = vec2(25., 45.);
     faceIndexs[4] = vec2(10., 46.);
     faceIndexs[5] = vec2(22., 46.);
     faceIndexs[6] = vec2(14., 49.);
     faceIndexs[7] = vec2(18., 49.);
     faceIndexs[8] = vec2(16., 49.);
     
     for(int i = 0; i < 9; i++)
     {
         int originIndex = int(faceIndexs[i].x);
         int targetIndex = int(faceIndexs[i].y);
         vec2 originPoint = vec2(facePoints[originIndex * 2], facePoints[originIndex * 2 + 1]);
         vec2 targetPoint = vec2(facePoints[targetIndex * 2], facePoints[targetIndex * 2 + 1]);
         currentCoordinate = curveWarp(currentCoordinate, originPoint, targetPoint, thinFaceDelta);
     }
     return currentCoordinate;
 }
 
 vec2 bigEye(vec2 currentCoordinate) {
     
     vec2 faceIndexs[2];
     faceIndexs[0] = vec2(74., 72.);
     faceIndexs[1] = vec2(77., 75.);
     
     for(int i = 0; i < 2; i++)
     {
         int originIndex = int(faceIndexs[i].x);
         int targetIndex = int(faceIndexs[i].y);
         
         vec2 originPoint = vec2(facePoints[originIndex * 2], facePoints[originIndex * 2 + 1]);
         vec2 targetPoint = vec2(facePoints[targetIndex * 2], facePoints[targetIndex * 2 + 1]);
         
         float radius = distance(vec2(targetPoint.x, targetPoint.y / aspectRatio), vec2(originPoint.x, originPoint.y / aspectRatio));
         radius = radius * 5.;
         currentCoordinate = enlargeEye(currentCoordinate, originPoint, radius, bigEyeDelta);
     }
     return currentCoordinate;
 }

 void main()
 {
     vec2 positionToUse = textureCoordinate;
     
     if (hasFace == 1) {
         positionToUse = thinFace(positionToUse);
         positionToUse = bigEye(positionToUse);
     }
     
     gl_FragColor = texture2D(inputImageTexture, positionToUse);
 }
 );
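For context, a rough sketch of how such a shader is typically wired into a GPUImage chain; the project's actual filter class may differ, and camera / previewView are illustrative names:

GPUImageFilter *faceFilter = [[GPUImageFilter alloc] initWithFragmentShaderFromString:kGPUImageThinFaceFragmentShaderString];
[camera addTarget:faceFilter];        // camera is the project's GPUImageFaceCamera
[faceFilter addTarget:previewView];   // previewView is a GPUImageView
// Each frame, hasFace, facePoints, aspectRatio, thinFaceDelta and bigEyeDelta are
// updated on faceFilter from the latest FaceDetector result (see section 4).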

4. Using Face++

FaceDetector is the class in the project dedicated to handling Face++ related operations. Its header file is as follows:

@interface FaceDetector : NSObject

@property (assign, nonatomic) BOOL isAuth;

@property (copy, nonatomic, readonly) NSArray<FaceModel *> *faceModels;
@property (strong, nonatomic, readonly) FaceModel *oneFace;
@property (assign, nonatomic, readonly) BOOL isWorking;
@property (assign, nonatomic) int faceOrientation;
@property (assign, nonatomic) FaceDetectorSampleBufferOrientation sampleBufferOrientation;
@property (assign, nonatomic) FaceDetectorSampleType sampleType;

+ (instancetype)shareInstance;
- (void)getLandmarksFromSampleBuffer:(CMSampleBufferRef)detectSampleBufferRef;
- (void)auth;
@end
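A minimal usage sketch based only on the header above; the variable names are illustrative, and in the project the per-frame call is actually made inside GPUImageFaceCamera (see 4.2):

FaceDetector *detector = [FaceDetector shareInstance];
[detector auth];                                   // request the Face++ license first (see 4.1)
detector.faceOrientation = 90;                     // face offset angle in the frame (see 4.3)
detector.sampleBufferOrientation = FaceDetectorSampleBufferOrientationCameraBack;

// For each captured frame:
[detector getLandmarksFromSampleBuffer:sampleBuffer];
FaceModel *face = detector.oneFace;                // the most recently detected face, if any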

4.1 License

Before using Face++, you need to replace the key and secret; in the project they are located in Face++/MGNetAccount.h. Then call the authorization method; after authorization succeeds, Face++ can be used for face detection.

- (void)auth {
    self.isAuth = false;
    [MGFaceLicenseHandle licenseForNetwokrFinish:^(bool License, NSDate *sdkDate) {
        if (!License) {
            NSLog(@"Network authorization failed!");
            self.isAuth = false;
        } else {
            NSLog(@"Network authorization succeeded");
            self.isAuth = true;
        }
    }];
}

4.2 Configuring the Video Frame Format

Face++ accepts video frames of type CMSampleBufferRef, but it does not support the YUV format, so the BGRA format must be selected when decoding.

The project uses GPUImage for decoding, but GPUImage has some implementation issues when producing video frames in BGRA format, so the camera and video file inputs must use the project's own GPUImageFaceCamera and GPUImageFaceMovie classes. They inherit from GPUImage's GPUImageVideoCamera and GPUImageMovie respectively and internally override some configuration methods so that the returned video frames are in BGRA format.

In addition, after obtaining a video frame these two classes call FaceDetector's - (void)getLandmarksFromSampleBuffer:(CMSampleBufferRef)detectSampleBufferRef; method to obtain the face information, so the detected face data can be used directly in the filter classes.
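For reference, forcing BGRA output at the AVFoundation level looks like the snippet below; GPUImageFaceCamera presumably applies an equivalent configuration internally.

AVCaptureVideoDataOutput *videoOutput = [[AVCaptureVideoDataOutput alloc] init];
videoOutput.videoSettings = @{ (id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA) };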

4.3 Setting the Face Direction

Face direction refers to the counterclockwise offset angle of the face in the video frame; an offset angle of 0 means the face is upright, i.e. in the vertical direction.

- (void)setFaceOrientation:(int)faceOrientation {
    [self.markManager updateFaceppSetting:^(MGFaceppConfig *config) {
        config.orientation = faceOrientation;
    }];
}

Given a photo of a face, our naked eye can easily determine the face's offset angle in the picture, but what is passed to Face++ is a video frame from the camera or a video file, so how do we obtain the offset angle of the face?

4.3.1 Video Frames from the Camera (GPUImageFaceCamera)

When shooting with the camera, the video frame produced by the camera is, by default, not oriented the same way as what we see. The orientation of the video frames can be specified through AVCaptureConnection's videoOrientation property.

typedef NS_ENUM(NSInteger, AVCaptureVideoOrientation) {
    AVCaptureVideoOrientationPortrait           = 1, // Phone upright, home button at the bottom: the camera's image matches the original scene.
    AVCaptureVideoOrientationPortraitUpsideDown = 2, // Phone upright, home button at the top: the camera's image matches the original scene.
    AVCaptureVideoOrientationLandscapeRight     = 3, // Phone horizontal, home button on the right: the camera's image matches the original scene.
    AVCaptureVideoOrientationLandscapeLeft      = 4, // Phone horizontal, home button on the left: with the front camera the image is a horizontal mirror of the original scene; with the rear camera it matches the original scene.
};
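For reference, the property would be set on the capture connection as below (videoOutput being an AVCaptureVideoDataOutput); as explained next, GPUImage deliberately leaves it at its default and rotates on the GPU instead.

AVCaptureConnection *connection = [videoOutput connectionWithMediaType:AVMediaTypeVideo];
if ([connection isVideoOrientationSupported]) {
    connection.videoOrientation = AVCaptureVideoOrientationPortrait;
}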

The original image mentioned above is what we see with our naked eyes, and it may not be the same as the video frame produced by a camera or video file.

Since the project drives the camera through the GPUImage library, and GPUImage does not set this property, the default value is used. (This is probably done for performance: setting the property makes the system apply a rotation matrix to every video frame, whereas GPUImage chooses to perform the rotation on the GPU.)

Rear camera

When using the rear camera, videoOrientation defaults to AVCaptureVideoOrientationLandscapeRight; that is, when the phone is held horizontally with the home button on the right, the image produced by the camera is the same as the original scene. From this we can derive the following two transformation diagrams: the left picture represents the original scene, the right picture represents the video frame produced by the camera (which is also the frame passed to Face++), and each row corresponds to one way of holding the phone.

We can now derive the offset angle of the face from the two pictures above.

First we set a premise: the face in the original scene is always in the vertical direction, i.e. its offset angle is 0. This is reasonable, since a head cannot tilt by more than 90 degrees. We can then treat the digits 3 and 4 in the pictures above as faces and compute their rotation angles in the right-hand picture to obtain the face's offset angle for each phone orientation.

This gives the following mapping table:

Phone orientation | Camera position | Horizontal flip | Face rotation angle in frame (counterclockwise)
Vertical, home button at the bottom | Rear | No | 90
Vertical, home button at the top | Rear | No | 270
Horizontal, home button on the left | Rear | No | 180
Horizontal, home button on the right | Rear | No | 0
Front facing camera

When using the front camera, videoOrientation defaults to AVCaptureVideoOrientationLandscapeLeft. Likewise, when the phone is held horizontally with the home button on the left, the image produced by the camera is a horizontal mirror of the original scene, so an additional horizontal flip is needed to make it look exactly like the original.

The transformation diagrams between the original scene and the video frame for the vertical and horizontal phone orientations are given in the same way.

Mapping table between phone orientation and face offset angle:

Phone orientation | Camera position | Horizontal flip | Face rotation angle in frame (counterclockwise)
Vertical, home button at the bottom | Front | Yes | 90
Vertical, home button at the top | Front | Yes | 270
Horizontal, home button on the left | Front | Yes | 0
Horizontal, home button on the right | Front | Yes | 180

Compared with the rear camera, the front camera needs this extra horizontal flip. Since Face++ does not provide an interface to set a horizontal flip, the face data it returns for front camera images has two small problems. One is that the left/right order of the facial feature points is reversed; this can be ignored because the two sides of the face are roughly symmetric. The other is that the Euler angles point in the opposite direction; we will discuss how to handle that later.

Check the direction of the phone

Since the app only supports the Portrait orientation, we cannot use methods such as - (void)willRotateToInterfaceOrientation:(UIInterfaceOrientation)toInterfaceOrientation duration:(NSTimeInterval)duration to obtain the phone's orientation. A better way is to use CoreMotion to read the acceleration along the X, Y and Z axes and determine the phone's current orientation.

- (void)startMotion {
    self.motionManager = [[CMMotionManager alloc] init];
    self.motionManager.accelerometerUpdateInterval = 0.3f;
    NSOperationQueue *motionQueue = [[NSOperationQueue alloc] init];
    [motionQueue setName:@"com.megvii.gryo"];
    __weak typeof(self) weakSelf = self;
    [self.motionManager startAccelerometerUpdatesToQueue:motionQueue withHandler:^(CMAccelerometerData * _Nullable accelerometerData, NSError * _Nullable error) {
        
        __strong typeof(weakSelf) self = weakSelf;
        if (fabs(accelerometerData.acceleration.z) > 0.7) {
            // Phone lying roughly flat: fall back to the portrait value
            self->orientation = 90;
        } else {
            if (AVCaptureDevicePositionBack == self->devicePosition) {
                if (fabs(accelerometerData.acceleration.x) < 0.4) {
                    self->orientation = 90;
                } else if (accelerometerData.acceleration.x > 0.4) {
                    self->orientation = 180;
                } else if (accelerometerData.acceleration.x < -0.4) {
                    self->orientation = 0;
                }
            } else {
                if (fabs(accelerometerData.acceleration.x) < 0.4) {
                    self->orientation = 90;
                } else if (accelerometerData.acceleration.x > 0.4) {
                    self->orientation = 0;
                } else if (accelerometerData.acceleration.x < -0.4) {
                    self->orientation = 180;
                }
            }
            if (accelerometerData.acceleration.y > 0.6) {
                self->orientation = 270;
            }
        }
    }];
}

4.3.2 Video Frames from Video Files (GPUImageFaceMovie)

The transformation matrix is obtained from the preferredTransform property of AVAssetTrack, and the rotation angle of the video, which is also the rotation angle of the face, is then determined from that matrix.

- (NSInteger)orientation {
    NSInteger degree = 0;
    NSArray *tracks = [self.asset tracksWithMediaType:AVMediaTypeVideo];
    if([tracks count] > 0) {
        AVAssetTrack *videoTrack = [tracks objectAtIndex:0];
        CGAffineTransform t = videoTrack.preferredTransform;
        
        if (t.a == 0 && t.b == 1.0 && t.c == -1.0 && t.d == 0) {
            // Portrait
            degree = 90;
        } else if (t.a == 0 && t.b == -1.0 && t.c == 1.0 && t.d == 0) {
            // PortraitUpsideDown
            degree = 270;
        } else if (t.a == 1.0 && t.b == 0 && t.c == 0 && t.d == 1.0) {
            // LandscapeRight
            degree = 0;
        } else if (t.a == -1.0 && t.b == 0 && t.c == 0 && t.d == -1.0) {
            // LandscapeLeft
            degree = 180;
        }
    }
    return degree;
}

4.4 Transforming Face Feature Point Coordinates and Euler Angles

Where is the face data returned by Face++ used?

GPUImage uploads each video frame to a texture and then passes the texture on to its targets, which are the classes that conform to the GPUImageInput protocol and are referred to here as filter classes.

The face data returned by Face++ is only used in these filter classes, and the texture image in the filter classes is not the same as the video frame that was passed to Face++ for detection. In other words, the reference frame in which the face data was generated is different from the reference frame in which it is used, so we need to convert the face data before using it.

How to convert face data

From setting the face orientation earlier we already know what the video frame looks like, so what does the texture in the filter class look like? See the picture below.

The first row uses the rear camera and the second row uses the front camera. The first and second columns represent the original scene and the video frame produced by the camera respectively. The third column shows the video frame after it is uploaded to a texture: since OpenGL's origin is in the lower-left corner, the image ends up upside down. The fourth column shows the transformed texture, i.e. the texture image as seen in the filter class.

(For how GPUImage transforms the third column into the fourth, see GPUImageVideoCamera's updateOrientationSendToTargets method. With this project's configuration of GPUImageVideoCamera (outputImageOrientation = UIInterfaceOrientationPortrait, _horizontallyMirrorFrontFacingCamera = true), the rotation specified for the rear camera is kGPUImageRotateRight and for the front camera it is kGPUImageRotateRightFlipVertical. Both enum names happen to describe the reverse transformation, i.e. the steps needed to go from the fourth column back to the third.)

Transformation of the coordinates of feature points

Assume point is the coordinate of a Face++ feature point when the rear camera is used, corresponding to the video frame in the first row of the figure above; it is expressed relative to position 4. Looking at the final texture image of that row, the origin is at position 3 and position 4 has been moved to the lower-right corner, so we need to transform point so that it is expressed relative to position 3.

The transformed coordinate is (1 - point.y / height, point.x / width), where width is the length of the video frame's edge between positions 4 and 2 in the second column, and height is the length of its edge between positions 4 and 3.

The transformation rules of feature points are as follows

- (CGPoint)transformPointoPortrait:(CGPoint)point {
    
    CGFloat width = frameWidth;
    CGFloat height = frameHeight;
    switch (self.sampleBufferOrientation) {
        case FaceDetectorSampleBufferOrientationCameraFrontAndHorizontallyMirror:
            return CGPointMake(point.y/height, point.x/width);
            
        case FaceDetectorSampleBufferOrientationCameraBack:
            return CGPointMake(1 - point.y/height, point.x/width);
            
        case FaceDetectorSampleBufferOrientationRatation90:
            return CGPointMake(1 - point.y/height, point.x/width);
            
        case FaceDetectorSampleBufferOrientationRatation180:
            return CGPointMake(1 - point.x/width, 1 - point.y/height);
            
        case FaceDetectorSampleBufferOrientationRatation270:
            return CGPointMake(point.y/height, 1 - point.x/width);
            
        case FaceDetectorSampleBufferOrientationNoRatation:
            return CGPointMake(point.x/width, point.y/height);
    }
}
Euler Angle transformation

FaceInfo's three attributes pitch, yaw and roll represent the rotation angles of the face around the X, Y and Z axes in the untransformed video frame. In the project they are converted as follows before use; note the 90-degree offset applied to roll and the inverted directions for the mirrored front camera (the Euler angle issue mentioned in 4.3):

model.pitchAngle = -faceInfo.pitch;
model.yawAngle = isFront ? -faceInfo.yaw : faceInfo.yaw;
model.rollAngle = isFront  ? -(M_PI/2 - faceInfo.roll) : (M_PI/2 - faceInfo.roll);