I. Introduction

Expression recognition can be performed with a single sense or with multiple senses, and it results from a combination of holistic recognition and feature-based recognition. Specifically, recognition at a distance is mainly holistic, while in close-range facial expression recognition, feature-based recognition matters more. In addition, different parts of the face contribute to recognition in different ways; the eyes and mouth, for example, are more important than the nose. Studies of the human brain suggest that, although facial expression recognition and face recognition are connected, they are on the whole separate, parallel processes.

With the progress of face processing technology (including face detection and face recognition), analyzing facial expressions by computer has become possible. In general, facial expression analysis is a very difficult research direction, mainly because of the accuracy and effectiveness required of facial expression feature extraction. The latter is especially hard because the feature-point movements produced by different expressions are not very distinct: an open mouth does not necessarily mean a smile, since it may also indicate crying or surprise.

At present, facial expression recognition technology is mainly applied in the fields of human-computer interaction, security, robot manufacturing, medical treatment, communication and automobile.

In 1971, the psychologists Ekman and Friesen first proposed that human beings have six main emotions, each reflecting a unique mental activity with a unique expression. These six, called the basic emotions, are anger, happiness, sadness, surprise, disgust and fear.

Some of the methods described below evolved from face recognition, combined with the particular characteristics of facial expression recognition.

At present, three kinds of recognition features are used: gray-level features, motion features and frequency-domain features. Gray-level features are computed from the gray values of the expression image; recognition is based on the different gray values produced by different expressions. In this case, the image must be fully preprocessed for illumination, angle and other factors so that the gray values are normalized. Motion features use the movement of the main facial feature points under different expressions. Frequency-domain features exploit the differences between expression images under different frequency decompositions; their notable advantage is speed.

In terms of specific facial expression recognition methods, there are three main pairs of directions: whole recognition versus local recognition, deformation extraction versus motion extraction, and geometric-feature methods versus facial-feature methods. In the whole recognition method, whether working from the deformation of the face or from its motion, the facial expression is analyzed as a whole to find the image differences under various expressions. Typical methods include eigenface-based Principal Component Analysis (PCA), Independent Component Analysis (ICA), Fisher's Linear Discriminants (FLD), Local Feature Analysis (LFA), Fisher Actions, Hidden Markov Models (HMM) and cluster analysis.
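As a concrete illustration of the holistic PCA ("eigenface") approach mentioned above, here is a minimal numpy sketch. The toy data and the function names are illustrative, not part of any particular paper's method:

```python
import numpy as np

def eigenface_basis(images, k):
    """Compute the top-k principal components ('eigenfaces') of a set of
    flattened face images, one image per row."""
    X = np.asarray(images, dtype=float)
    mean_face = X.mean(axis=0)
    Xc = X - mean_face                       # center the data
    # SVD of the centered data gives the principal directions
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return mean_face, Vt[:k]                 # k eigenfaces (rows)

def project(image, mean_face, basis):
    """Project a flattened image onto the eigenface subspace; the resulting
    low-dimensional coefficients are the features used for recognition."""
    return basis @ (np.asarray(image, dtype=float) - mean_face)

# Toy demo: 4 fake 'images' of 6 pixels each
rng = np.random.default_rng(0)
faces = rng.random((4, 6))
mean_face, basis = eigenface_basis(faces, k=2)
coeffs = project(faces[0], mean_face, basis)
print(coeffs.shape)
```

In a real system each row would be a preprocessed face image flattened to a vector, and expressions would be classified by comparing projection coefficients.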

The local recognition method treats each part of the face separately during recognition; that is, the importance of each part differs. In facial expression recognition the most informative parts are the eyes, mouth and eyebrows, whose various movements signal different expressions. The nose moves comparatively little, so it can largely be ignored during recognition, which speeds up processing and improves accuracy. The most typical methods are the Facial Action Coding System (FACS) and the facial-movement-parameter method in MPEG-4; other methods include local principal component analysis (PCA), Gabor wavelets and neural networks. FACS defines basic deformation units, AUs (Action Units), according to the types and movement characteristics of the facial muscles. Every facial expression can be decomposed into its corresponding AUs, and facial characteristic information is analyzed by tracking the AU changes in the face.

FACS has two main weaknesses: 1. the action units are purely local spatial templates; 2. it carries no temporal description, only heuristic information.

The deformation extraction method identifies expressions from the deformation of the various parts of the face. The main methods are principal component analysis (PCA), Gabor wavelets, the Active Shape Model (ASM) [7] and the Point Distribution Model (PDM).

The motion extraction method relies on the principle that certain features of the face make characteristic movements under specific expressions. In the six basic expressions mentioned above, the movement direction or trend of certain fixed feature points (or parts) on the face is fixed. For example, when people are afraid, the eyes open wider than normal and the mouth is generally open. The details are shown in Table 1. Typical recognition methods include Optical Flow [8] and the Face Animation Parameters (FAP) in MPEG-4.

The geometric-feature method extracts a feature vector from the shape and position of each part of the human face (including the mouth, eyes, eyebrows and nose); this vector represents the geometric features of the face, and different expressions can be identified from it. An important method here is principal component analysis based on action units (AUs). In the facial-feature method, the whole face or a local region is filtered to obtain the feature vector; a common filter is the Gabor wavelet.
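A minimal sketch of building such a geometric feature vector from 2D landmark positions. The landmark names, the chosen distances and the normalization are illustrative assumptions, not a standard feature set:

```python
import numpy as np

def geometric_features(landmarks):
    """Build a simple geometric feature vector from 2D facial landmarks.
    `landmarks` maps hypothetical point names to (x, y) coordinates."""
    p = {k: np.asarray(v, dtype=float) for k, v in landmarks.items()}
    eye_dist   = np.linalg.norm(p['left_eye'] - p['right_eye'])
    mouth_w    = np.linalg.norm(p['mouth_left'] - p['mouth_right'])
    mouth_open = np.linalg.norm(p['mouth_top'] - p['mouth_bottom'])
    # Normalize by inter-ocular distance so the vector is scale-invariant
    return np.array([mouth_w / eye_dist, mouth_open / eye_dist])

features = geometric_features({
    'left_eye': (30, 40), 'right_eye': (70, 40),
    'mouth_left': (38, 80), 'mouth_right': (62, 80),
    'mouth_top': (50, 75), 'mouth_bottom': (50, 88),
})
print(features)
```

A larger mouth-opening ratio in the second component would, for instance, distinguish surprise from a neutral face.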

Of course, these three pairs of directions are not strictly independent. They extract the required facial features from different angles, each providing a way of thinking about facial expressions, and they are interrelated and interact with each other. Many methods fall between two of them or among all three; for example, the facial action coding system is a local method but also draws on facial motion.

The process and method of facial expression recognition

1. Establishing a facial expression database

At present, the facial expression databases commonly used in research mainly include:

The Cohn-Kanade AU-Coded Facial Expression Image Database (CKACFEID), jointly established by the CMU Robotics Institute and the Department of Psychology.

The Japanese Female Facial Expression (JAFFE) database, established by ATR in Japan, is an important test set for studying Asian facial expressions.

The FER2013 face dataset, which can be downloaded from the Kaggle website.

More databases: see the reference links.

2. Facial expression recognition

(1) Image acquisition: obtain static images or dynamic image sequences through image capture tools such as cameras.

(2) Image preprocessing: normalization of image size and gray scale, correction of head posture, image segmentation, etc.

Objective: to improve image quality, eliminate noise, and unify image gray values and size, laying a good foundation for subsequent feature extraction and classification.

Main work: segmentation of the facial expression sub-regions and normalization of the facial expression images (scale and gray level).
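The normalization step can be sketched in a few lines. This is a minimal illustration assuming a grayscale input; the nearest-neighbour resize and the min-max gray stretch stand in for whatever interpolation and illumination correction a real pipeline would use:

```python
import numpy as np

def normalize_gray(img, out_size=(64, 64)):
    """Scale and gray-level normalization of a face image (illustrative):
    nearest-neighbour resize to a fixed size, then stretch intensities to [0, 1]."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    rows = (np.arange(out_size[0]) * h / out_size[0]).astype(int)
    cols = (np.arange(out_size[1]) * w / out_size[1]).astype(int)
    resized = img[np.ix_(rows, cols)]          # nearest-neighbour sampling
    lo, hi = resized.min(), resized.max()
    return (resized - lo) / (hi - lo + 1e-8)   # gray-scale normalization

face = np.random.default_rng(1).integers(0, 256, (120, 100)).astype(float)
norm = normalize_gray(face)
print(norm.shape)
```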

(3) Feature extraction: transform the pixel array into a higher-level image representation, such as shape, motion, color, texture or spatial structure, and reduce the dimensionality of the huge image data while preserving stability and recognition rate as much as possible.

The main methods of feature extraction include geometric feature extraction, statistical feature extraction, frequency domain feature extraction and motion feature extraction

1) Feature extraction using geometric features mainly locates and measures the salient features of facial expressions, such as the position changes of the eyes, eyebrows and mouth, and determines their size, distance, shape, mutual proportions and other features for facial expression recognition.

Advantages: Reduced input data

Disadvantages: some important identification and classification information is lost, and the accuracy of the results is not high

2) Methods based on overall statistical features aim to preserve as much information as possible from the original facial expression image and to let the classifier find the relevant features; the features used for recognition are obtained by transforming the whole facial expression image.

Main methods: PCA (principal component analysis) and ICA (independent component analysis)

PCA uses an orthogonal basis to explain the main directions of variation in the data. Advantages: good reconstructibility. Disadvantages: poor separability.

ICA can obtain independent components of data and has good separability

Disadvantage of extraction based on overall statistical features: interference from external factors (lighting, angle, complex backgrounds, etc.) reduces the recognition rate.

3) Feature extraction based on the frequency domain: the image is converted from the spatial domain to the frequency domain to extract its features (lower-level features).

Main method: Gabor wavelet transform

The wavelet transform can perform multi-resolution analysis of an image at different kernel frequencies, bandwidths and orientations, effectively extracting image details at different levels and orientations in a relatively stable way. However, as low-level features they cannot be used directly for matching and recognition; combined with an ANN or SVM classifier, they improve the accuracy of expression recognition.
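A minimal sketch of generating such a filter bank. The kernel below is the real part of a standard 2D Gabor function; the size and parameter values are illustrative assumptions:

```python
import numpy as np

def gabor_kernel(size=21, sigma=4.0, theta=0.0, lam=10.0, gamma=0.5):
    """Real part of a 2D Gabor kernel: a Gaussian envelope modulating a
    cosine wave with orientation `theta` and wavelength `lam`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / lam)

# A bank of kernels at several orientations, as used for expression features;
# each kernel would be convolved with the face image and the responses pooled.
bank = [gabor_kernel(theta=t) for t in np.linspace(0, np.pi, 4, endpoint=False)]
print(len(bank), bank[0].shape)
```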

4) Extraction based on motion features: extraction of the motion features of dynamic image sequences (a focus of future research).

Main method: optical flow method

Optical flow refers to the apparent motion caused by the brightness pattern: the projection onto the imaging plane of the three-dimensional velocity vectors of visible points in the scene. It represents the instantaneous change of position of scene-surface points in the image, and the optical flow field carries rich information about motion and structure.

The optical flow model is an effective method for processing moving images. Its basic idea is to treat the moving image as a function f(x, y, t), establish the optical flow constraint equation from the principle of image intensity conservation, and calculate the motion parameters by solving this constraint equation.

Advantages: reflects the essence of facial expression changes and is less affected by uneven illumination.

Disadvantages: large amount of calculation
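The constraint equation above can be solved in the least-squares sense over a window, which is the Lucas-Kanade idea. A minimal numpy sketch under the assumption of a single global translation (a real tracker would solve per-window around each feature point):

```python
import numpy as np

def lucas_kanade_flow(I1, I2):
    """Solve the optical-flow constraint Ix*u + Iy*v + It = 0 in the
    least-squares sense over one window, recovering a single (u, v)."""
    I1 = np.asarray(I1, float)
    I2 = np.asarray(I2, float)
    Ix = np.gradient(I1, axis=1)     # horizontal spatial derivative
    Iy = np.gradient(I1, axis=0)     # vertical spatial derivative
    It = I2 - I1                     # temporal derivative
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    (u, v), *_ = np.linalg.lstsq(A, -It.ravel(), rcond=None)
    return u, v

# Recover a one-pixel horizontal shift of a smooth test pattern
y, x = np.mgrid[0:40, 0:40]
frame = np.sin(x / 5.0) + np.cos(y / 7.0)
shifted = np.sin((x - 1) / 5.0) + np.cos(y / 7.0)   # frame moved 1 px right
u, v = lucas_kanade_flow(frame, shifted)
print(round(float(u), 2), round(float(v), 2))
```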

(4) Classification: including classifier design and the classification decision.

For the classifier design and selection stage of facial expression recognition, the main methods are: linear classifiers, neural network classifiers, support vector machines, hidden Markov models and other classification methods.

4.1) Linear classifier: it is assumed that the pattern spaces of the different categories are linearly separable, the separability arising mainly from the differences between expressions.
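Under that separability assumption, a linear classifier just fits a separating hyperplane. A minimal two-class sketch with toy feature vectors (least-squares fit; the data and class labels are illustrative):

```python
import numpy as np

class LinearClassifier:
    """Minimal two-class linear classifier: least-squares fit of a
    hyperplane, assuming the feature vectors are linearly separable."""
    def fit(self, X, y):
        Xb = np.hstack([X, np.ones((len(X), 1))])   # append bias term
        self.w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
        return self

    def predict(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return np.sign(Xb @ self.w)

# Toy 2D features for two 'expression' classes, labelled -1 and +1
X = np.array([[0., 0.], [0., 1.], [2., 2.], [2., 3.]])
y = np.array([-1., -1., 1., 1.])
clf = LinearClassifier().fit(X, y)
print(clf.predict(X))
```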

4.2) Neural network classifier: an Artificial Neural Network (ANN) is a network structure that simulates the neurons of the human brain, an adaptive nonlinear dynamic system connected from a large number of simple basic components (neurons). With the coordinates of facial feature points and their corresponding gray values as input, an ANN can model very complex decision boundaries between classes.

Neural network classifiers mainly include multilayer perceptron, BP network and RBF network

Disadvantages: requires many training samples and long training times, and cannot meet real-time processing requirements.

4.3) Support vector machine (SVM) classification: strong generalization ability, well suited to small-sample, nonlinear and high-dimensional pattern recognition problems; a new research hotspot.

Basic idea: for samples that are not linearly separable, the input space is transformed into a high-dimensional space by a nonlinear transformation, and the optimal linear decision boundary is then found in the new space. The nonlinear transformation is realized by defining an appropriate inner-product (kernel) function; three are commonly used: the polynomial, radial-basis and sigmoid inner-product functions.
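The three inner-product functions can be written down directly. A minimal numpy sketch with illustrative parameter values (d, c, gamma, a, b are assumptions, to be tuned in practice):

```python
import numpy as np

def poly_kernel(x, y, d=3, c=1.0):
    """Polynomial inner-product function: (x.y + c)^d"""
    return (np.dot(x, y) + c) ** d

def rbf_kernel(x, y, gamma=0.5):
    """Radial-basis inner-product function: exp(-gamma * ||x - y||^2)"""
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(y)) ** 2))

def sigmoid_kernel(x, y, a=0.01, b=0.0):
    """Sigmoid inner-product function: tanh(a * x.y + b)"""
    return np.tanh(a * np.dot(x, y) + b)

x, y = np.array([1.0, 2.0]), np.array([2.0, 0.5])
print(poly_kernel(x, y), rbf_kernel(x, y), sigmoid_kernel(x, y))
```

Each kernel computes the inner product of the two vectors after the implicit nonlinear mapping, without ever constructing the high-dimensional space explicitly.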

4.4) Hidden Markov Models (HMM): a statistical model with a robust mathematical structure, suitable for modeling the time series of dynamic processes, with strong pattern classification ability; in theory it can handle time series of arbitrary length, and it has a very wide range of applications.

Advantages: HMM method can accurately describe the changing nature and dynamic performance of facial expressions
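The core computation behind HMM-based classification is the forward algorithm, which scores how well an observation sequence fits a model; the expression whose model gives the highest likelihood wins. A minimal sketch with a toy two-state model (the states, symbols and probabilities are illustrative assumptions):

```python
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    """Scaled forward algorithm: log-likelihood of a discrete observation
    sequence under an HMM with initial distribution pi, transition matrix A
    and emission matrix B (states x symbols)."""
    alpha = pi * B[:, obs[0]]            # initial step
    loglike = np.log(alpha.sum())
    alpha = alpha / alpha.sum()          # rescale to avoid underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # propagate through A, then emit
        s = alpha.sum()
        loglike += np.log(s)
        alpha = alpha / s
    return loglike

# Toy 2-state model (e.g. 'neutral' vs 'expressive') over binary observations
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
ll = forward_log_likelihood(pi, A, B, [0, 0, 1])
print(round(float(ll), 3))
```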

5.5) Other methods:

In the recognition method based on a physical face model, the face image is modeled as a deformable 3D mesh surface, and space and gray level are considered jointly in a 3D space.

The model-based image coding method uses a genetic algorithm to encode, recognize and synthesize different expressions.

3. Research prospects

(1) Robustness needs to be improved:

External factors interfere (mainly head deflection and changes in lighting)

Multi-camera technology and color compensation alleviate the problem to some extent, but the results are not ideal

(2) The computation amount of facial expression recognition needs to be reduced

(3) Strengthen the fusion of multiple information sources

Facial expression is not the only way emotions are expressed. Integrating information such as voice and intonation, pulse and body temperature to predict a person's inner emotions more accurately is a problem that facial expression recognition technology will have to consider

The specific facial expression recognition methods of the present stage are summarized above; as can be seen, they are basically handcrafted features plus shallow classifiers





II. Source code

 
% Code optimized for the following assumptions:
% 1. Only one face in the scene and it is the primary object
% 2. Faster noise reduction and face detection

% Originally by Tolga Birdal
% Implementation of the paper:
% "A simple and accurate face detection algorithm in complex background"
% by Yu-Tang Pai, Shanq-Jang Ruan, Mon-Chau Shie, Yi-Chi Liu

% Additions by Tolga Birdal:
%  Minimum face size constraint
%  Adaptive theta thresholding (theta is thresholded by mean2(theta)/4)
%  Parameters are modified to detect better. Please check the paper for
%  the parameters they propose.
% Check the paper for more details.

% Usage:
%  I=double(imread('c:\Data\girl1.jpg'));
%  detect_face(I);
% The function will display the bounding box if a face is found.
 
 
function [aa,SN_fill,FaceDat]=detect_face(I)
 
close all;
I=imread('./Test/Image029.jpg'); % hard-coded test image; comment this out to use the input argument
% No faces at the beginning
Faces=[];
numFaceFound=0;
 
I=double(I);
 
H=size(I,1);
W=size(I,2);
 
%%%%%%%%%%%%%%%%%% LIGHTING COMPENSATION %%%%%%%%%%%%%%%
 
C=255*imadjust(I/255,[0.3;1],[0;1]);
 
figure,imshow(C/255);
% title('Lighting compensation');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%% EXTRACT SKIN %%%%%%%%%%%%%%%%%%%%%%
YCbCr=rgb2ycbcr(C);
Cr=YCbCr(:,:,3);
 
S=zeros(H,W);
[SkinIndexRow,SkinIndexCol] =find(10<Cr & Cr<255);
for i=1:length(SkinIndexRow)
    S(SkinIndexRow(i),SkinIndexCol(i))=1;
end
 
m_S = size(S);
S(m_S(1)-7:m_S(1), :) = 0;  % zero out the bottom rows
%%%%%%%%%%%%%%%%%%%%%%%%%% REMOVE NOISE %%%%%%%%%%%%%%%%%%%%%%%%%%
% figure; imshow(S);
SN=zeros(H,W);
for i=1:H-5
    for j=1:W-5
        localSum=sum(sum(S(i:i+4, j:j+4)));
        SN(i:i+5, j:j+5)=(localSum>20);
    end
end
% figure; imshow(SN);
Iedge=edge(uint8(SN));
% figure; imshow(Iedge);
SE = strel('square',9);
SN_edge = imdilate(Iedge,SE);
% SN_edge = SN_edge1.*SN;
% figure; imshow(SN_edge);
SN_fill = imfill(SN_edge,'holes');
figure; imshow(SN_fill);
%%%%%%%%%%%%%%%%%%%% FIND SKIN COLOR BLOCKS %%%%%%%%%%%%%%%%%%%%%%
[L,lenRegions] = bwlabel(SN_fill,4);
AllDat  = regionprops(L,'BoundingBox','FilledArea');
AreaDat = cat(1, AllDat.FilledArea);
[maxArea, maxAreaInd] = max(AreaDat);
 
FaceDat = AllDat(maxAreaInd);
FaceBB = [FaceDat.BoundingBox(1),FaceDat.BoundingBox(2),...
    FaceDat.BoundingBox(3)-1,FaceDat.BoundingBox(4)-1];
 
aa=imcrop(rgb2gray(uint8(I)).*uint8(SN_fill),FaceBB);
 
 figure,imshow(aa);
 title('Identified Face');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
end

III. Operation results



IV. Note

MATLAB R2014a