1. Overview of the Haar feature principle

A Haar feature reflects local changes in the gray level of an image: its value is the difference between the pixel sums of rectangular modules. Haar features can be divided into several categories: edge features, linear features, center features, and diagonal features. In a feature template, the feature value is the sum of the pixels under the white rectangles minus the sum of the pixels under the black rectangles. Some features of a face can be described simply by such rectangular difference features: the eyes are darker than the cheeks, the sides of the nose bridge are darker than the bridge itself, the mouth is darker than its surroundings, and so on. However, rectangular features are only sensitive to simple structures such as edges and line segments, so they can only describe image structures with obvious gradient changes in specific directions (horizontal, vertical and diagonal).

For example, in the figure above, the Haar feature value of templates A, B and D is V = Sum(white) − Sum(black), while for template C it is V = Sum(white, left) + Sum(white, right) − 2 × Sum(black). The black sum is multiplied by 2 so that the number of pixels contributing to the white term matches the number contributing to the black term (the two white rectangles together cover twice the area of the black one).

For a given image, a large number of features can be enumerated by varying the size and position of the feature template. The templates in the figure above are called "feature prototypes"; a prototype expanded inside an image sub-window is called a "rectangular feature"; and the value of a rectangular feature is called a "feature value". For instance, in a 24 × 24 image, feature A can be computed with a 20 × 20 rectangular template starting at coordinates (0, 0), with the same 20 × 20 template starting at (0, 2), or with a 22 × 22 template starting at (0, 0). Because rectangular feature values vary with the category, size and position of the template, even a very small image contains a great many rectangular features. A rectangular feature value is therefore a function of the template category, the rectangle's position, and the rectangle's size.
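To make this enumeration concrete, here is a minimal sketch (my own illustration, not code from the original) of how a two-rectangle feature in the style of template A can be evaluated with an integral image, so that each rectangle sum costs only four lookups no matter how large the rectangle is:

import numpy as np

def integral_image(img):
    # ii[y, x] = sum of all pixels in img[0:y+1, 0:x+1]
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, x, y, w, h):
    # Sum of the w-by-h rectangle whose top-left corner is (x, y),
    # computed with at most four integral-image lookups.
    total = ii[y + h - 1, x + w - 1]
    if x > 0:
        total -= ii[y + h - 1, x - 1]
    if y > 0:
        total -= ii[y - 1, x + w - 1]
    if x > 0 and y > 0:
        total += ii[y - 1, x - 1]
    return total

def haar_feature_A(ii, x, y, w, h):
    # Two-rectangle feature, here assumed to be white left half minus
    # black right half (w must be even).
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

window = np.random.randint(0, 256, size=(24, 24)).astype(np.int64)
ii = integral_image(window)
# The three template placements mentioned above:
print(haar_feature_A(ii, 0, 0, 20, 20))
print(haar_feature_A(ii, 0, 2, 20, 20))
print(haar_feature_A(ii, 0, 0, 22, 22))

Precomputing the integral image once is what makes enumerating tens of thousands of rectangular features per window affordable.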

2. AdaBoost classifier

AdaBoost is a typical Boosting algorithm. Before we talk about AdaBoost itself, let's talk about Boosting. Boosting is the process of promoting a "weak learning algorithm" into a "strong learning algorithm", and its main idea is that "two heads are better than one". In general it is relatively easy to find a weak learning algorithm; a series of weak classifiers is then obtained through repeated learning, and these weak classifiers are combined into a strong classifier. A Boosting algorithm involves two parts: an additive model and forward stagewise optimization. The additive model means that the strong classifier is a linear combination of a series of weak classifiers, with the general form:

F(x) = \sum_{m=1}^{M} \beta_m h(x; a_m)

Here h(x; a_m) is a weak classifier, a_m is the optimal parameter learned for that weak classifier, β_m is the proportion (weight) of the weak classifier within the strong classifier, and P denotes the collection of all the a_m and β_m. These weak classifiers add up linearly to form a strong classifier.

Forward stagewise means that during training, the classifier produced in each iteration is trained on the basis of the previous one. That is, it can be written like this:

F_m(x) = F_{m-1}(x) + \beta_m h(x; a_m)

Boosting algorithms differ according to their loss function, and AdaBoost is the Boosting algorithm whose loss function is the exponential loss L(y, F(x)) = \exp(-y F(x)).

Fundamentals of AdaBoost

Two questions arise. How does the weak classifier h(x; a_m) at each iteration differ from the previous ones, and how is it learned? And how is the weight β_m of each weak classifier determined? For the first question, AdaBoost changes the weights of the training data, that is, the probability distribution over the samples. The idea is to focus on the samples that were classified incorrectly: the weights of samples correctly classified in the previous round are reduced, and the weights of misclassified samples are increased. The weak classifier itself is then learned with some basic machine learning algorithm, such as logistic regression.

For the second question, AdaBoost uses weighted majority voting: it increases the weight of weak classifiers with a small classification error rate and reduces the weight of those with a large error rate. This is easy to understand: a weak classifier with higher accuracy should have a greater say in the strong classifier.

A worked example

To understand this, let’s take an example.

There are the following training samples, and we need to construct a strong classifier to classify them. Here x is the feature and y is the label.

| No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| y | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |

Let the weight distribution be D_1 = (w_{1,1}, w_{1,2}, \ldots, w_{1,10}), and assume the initial distribution is uniform: w_{1,i} = 0.1 for i = 1, 2, \ldots, 10.

Now it's time to train the first weak classifier. The classification error rate turns out to be lowest with a threshold of 2.5, giving the weak classifier:

G_1(x) = \begin{cases} 1, & x < 2.5 \\ -1, & x > 2.5 \end{cases}

Of course, other weak classifiers could be used, as long as the error rate is lowest; a piecewise (threshold) function is used here for convenience. The classification error rate obtained is e_1 = 0.3: the samples with x = 6, 7, 8 are misclassified.
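As a quick numerical check, here is a small sketch (the variable names are mine, assuming NumPy) that evaluates this stump on the ten samples and confirms e_1 = 0.3:

import numpy as np

x = np.arange(10)                                   # features x = 0, 1, ..., 9
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])    # labels
D1 = np.full(10, 0.1)                               # uniform initial weights

G1 = np.where(x < 2.5, 1, -1)                       # the threshold-2.5 stump
e1 = np.sum(D1[G1 != y])                            # weighted classification error rate
print(f"e1 = {e1:.4f}")                             # e1 = 0.3000 (x = 6, 7, 8 misclassified)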

The second step is to calculate the coefficient of G_1(x) in the strong classifier:

\alpha_1 = \frac{1}{2} \ln \frac{1 - e_1}{e_1} = \frac{1}{2} \ln \frac{0.7}{0.3} \approx 0.4236

I'll just put this formula here for now; we'll derive it in the last section of this article.

The third step is to update the weight distribution of the samples for the next round of iterative training, by the formula:

w_{2,i} = \frac{w_{1,i}}{Z_1} \exp(-\alpha_1 y_i G_1(x_i)), \quad i = 1, 2, \ldots, 10

This yields the new weight distribution, changed from the uniform 0.1 to:

D2 = (0.0715, 0.0715, 0.0715, 0.0715, 0.0715, 0.0715, 0.1666, 0.1666, 0.1666, 0.0715)

It can be seen that the weight of correctly classified samples decreases, while the weight of incorrectly classified samples increases.

The fourth step is to obtain the strong classifier after the first iteration:

F_1(x) = \alpha_1 G_1(x) = 0.4236 \, G_1(x)
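Continuing the numerical check (again a sketch of my own, not the original author's code), α_1 and D_2 come out as stated, up to rounding:

import numpy as np

x = np.arange(10)
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])
D1 = np.full(10, 0.1)
G1 = np.where(x < 2.5, 1, -1)
e1 = np.sum(D1[G1 != y])                    # 0.3, as above

alpha1 = 0.5 * np.log((1 - e1) / e1)        # coefficient of G1 in the strong classifier
w = D1 * np.exp(-alpha1 * y * G1)           # shrink correct samples, boost misclassified ones
D2 = w / w.sum()                            # normalize (divide by Z1)
print(f"alpha1 = {alpha1:.4f}")             # alpha1 = 0.4236
print(np.round(D2, 4))                      # ~0.0714 for correct samples, ~0.1667 for
                                            # x = 6, 7, 8 -- matching D2 above up to rounding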

And so on through the second round, …, the m-th round, repeating the iteration until the final strong classifier is obtained. The stopping rule can be defined as you like: for example, set a convergence threshold and stop once the classification error rate falls below a certain value, or cap the number of iterations and stop after, say, 1000 rounds. Here the data are simple, and a strong classifier is already obtained at the third iteration:

F_3(x) = 0.4236 \, G_1(x) + 0.6496 \, G_2(x) + 0.7514 \, G_3(x)

Its classification error rate on the training samples is 0, so the iteration ends.

F(x) = sign(F_3(x)) is the final strong classifier.

Algorithm process

To sum up, the algorithm flow of AdaBoost can be obtained:

Input: training data set

T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}

where x_i \in \mathcal{X} \subseteq \mathbb{R}^n and y_i \in \{-1, +1\}.

1. Initialize the weight distribution of the training samples:

D_1 = (w_{1,1}, \ldots, w_{1,i}, \ldots, w_{1,N}), \quad w_{1,i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N

2. For m = 1, 2, …, M:

(a) Learn with the training data set under weight distribution D_m, obtaining the weak classifier G_m(x).

(b) Calculate the classification error rate of G_m(x) on the training data set:

e_m = \sum_{i=1}^{N} w_{m,i} \, I(G_m(x_i) \neq y_i)

(c) Calculate the weight of G_m(x) in the strong classifier:

\alpha_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m}

(d) Update the weight distribution of the training data set (here Z_m is a normalization factor, so that the sample weights again sum to 1):

w_{m+1,i} = \frac{w_{m,i}}{Z_m} \exp(-\alpha_m y_i G_m(x_i)), \quad Z_m = \sum_{i=1}^{N} w_{m,i} \exp(-\alpha_m y_i G_m(x_i))

3. Obtain the final classifier:

F(x) = \mathrm{sign}\left(\sum_{m=1}^{M} \alpha_m G_m(x)\right)
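This flow translates almost line for line into code. Below is a minimal sketch under assumptions of my own (threshold stumps as the weak learner, early stopping once the training error reaches zero), run on the toy data from the worked example:

import numpy as np

def train_stump(x, y, D):
    # Steps 2(a)/(b): exhaustively pick the threshold and polarity with the
    # lowest weighted error rate e_m under the current distribution D.
    best = None
    for thresh in np.arange(x.min() - 0.5, x.max() + 1.0):
        for polarity in (1, -1):
            pred = np.where(x < thresh, polarity, -polarity)
            err = np.sum(D[pred != y])
            if best is None or err < best[0]:
                best = (err, thresh, polarity)
    return best

def adaboost(x, y, M=10):
    N = len(x)
    D = np.full(N, 1.0 / N)                     # step 1: uniform initial weights
    F = np.zeros(N)                             # running strong-classifier score
    for m in range(1, M + 1):
        e, thresh, pol = train_stump(x, y, D)   # steps 2(a)/(b)
        alpha = 0.5 * np.log((1 - e) / e)       # step 2(c); assumes 0 < e < 0.5
        G = np.where(x < thresh, pol, -pol)
        D = D * np.exp(-alpha * y * G)          # step 2(d): reweight the samples...
        D /= D.sum()                            # ...and normalize by Z_m
        F += alpha * G
        print(f"m={m}: threshold={thresh}, e={e:.4f}, alpha={alpha:.4f}")
        if np.all(np.sign(F) == y):             # training error already 0: stop
            break
    return F                                    # step 3: final classifier is sign(F)

x = np.arange(10)
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])
adaboost(x, y)

On this data it stops after three rounds of boosting with zero training error, consistent with the worked example above.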

The formulas

Now let’s figure out where this formula comes from.

Assume that F_{m−1}(x) has been obtained after m − 1 iterations. By the forward stagewise approach:

F_m(x) = F_{m-1}(x) + \alpha_m G_m(x)

We already know that AdaBoost uses the exponential loss, so the loss function over the training set is:

Loss = \sum_{i=1}^{N} \exp(-y_i F_m(x_i)) = \sum_{i=1}^{N} \exp\left(-y_i \left(F_{m-1}(x_i) + \alpha_m G_m(x_i)\right)\right)

At this point F_{m−1}(x) is known, so it can be pulled out in front as a constant factor:

Loss = \sum_{i=1}^{N} w_{m,i} \exp(-y_i \alpha_m G_m(x_i))

where w_{m,i} = \exp(-y_i F_{m-1}(x_i)) depends only on the earlier rounds.

Doesn't that still look unfamiliar? Let's simplify it a little by splitting the sum into correctly and incorrectly classified samples:

Loss = \sum_{y_i = G_m(x_i)} w_{m,i} \, e^{-\alpha_m} + \sum_{y_i \neq G_m(x_i)} w_{m,i} \, e^{\alpha_m}

Is that enough? Not quite, so let's continue to simplify Loss. Writing W = \sum_i w_{m,i} and e_m = \frac{1}{W} \sum_{y_i \neq G_m(x_i)} w_{m,i} for the weighted error rate, dividing through by W gives:

Loss / W = (1 - e_m) e^{-\alpha_m} + e_m e^{\alpha_m}

After this rearrangement of the formula, super exciting! The loss now depends on α_m through a single simple expression.

OK, so we now have the simplified loss function, and we take the derivative. Setting the partial derivative with respect to α_m to zero,

\frac{\partial Loss}{\partial \alpha_m} = -(1 - e_m) e^{-\alpha_m} + e_m e^{\alpha_m} = 0

we get:

\alpha_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m}

This is exactly the coefficient formula used in step 2(c) above.
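As a quick sanity check on this result (a numeric sketch of my own, not part of the original), a brute-force search over α agrees with the closed form:

import numpy as np

e = 0.3                                                  # example weighted error rate
alphas = np.linspace(0.01, 2.0, 200001)                  # brute-force grid over alpha
loss = (1 - e) * np.exp(-alphas) + e * np.exp(alphas)    # the simplified Loss (normalized weights)
print(alphas[np.argmin(loss)])                           # ~0.4236, the grid minimizer
print(0.5 * np.log((1 - e) / e))                         # 0.4236..., the closed form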

AdaBoost analysis, original link: https://www.cnblogs.com/ScorpioLu/p/8295990.html

3. Example

detectMultiScale()

Parameters

  • InputArray image: the image to be detected, generally a grayscale image to speed up detection;
  • std::vector<Rect>& objects: the output vector of bounding rectangles for the detected objects, e.g. the detected faces;
  • double scaleFactor: the factor by which the search window is scaled between two successive scans. The default is 1.1, meaning each step enlarges the search window by 10%; 1.1 is the usual value;
  • int minNeighbors: the minimum number of neighboring candidate rectangles required to retain a detection (default 3). If a candidate target is made up of fewer than min_neighbors − 1 rectangles, it is rejected. If min_neighbors is 0, the function returns all candidate rectangles without merging, which is useful when applying a user-defined combination step to the detection results;
  • int flags: either the default value or CV_HAAR_DO_CANNY_PRUNING. If set to CV_HAAR_DO_CANNY_PRUNING, the function uses Canny edge detection to exclude regions with too many or too few edges, since such regions usually do not contain a face;
  • Size minSize: the minimum possible object size, bounding the search range from below;
  • Size maxSize: the maximum possible object size, bounding the search range from above.
import cv2

face_xml = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
eye_xml = cv2.CascadeClassifier('haarcascade_eye.xml')
# Load the pre-trained Haar cascade files
# file source: https://juejin.cn/post/6844903607335124999

img = cv2.imread('timg.jpg')
# Load the image
cv2.imshow('src', img)
# Show the original image

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Convert to grayscale
faces = face_xml.detectMultiScale(gray, 1.3, 5)
# Detect faces: scaleFactor=1.3, minNeighbors=5
print('face=', len(faces))
# Print how many faces were detected

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
    # Draw a rectangle around each face
    roi_face = gray[y:y + h, x:x + w]
    roi_color = img[y:y + h, x:x + w]
    # Face region of interest in the grayscale and color images
    eyes = eye_xml.detectMultiScale(roi_face)
    # Detect eyes within the face region
    print('eye=', len(eyes))
    # Print how many eyes were detected in this face
    for (e_x, e_y, e_w, e_h) in eyes:
        cv2.rectangle(roi_color, (e_x, e_y), (e_x + e_w, e_y + e_h), (0, 255, 0), 2)
        # Draw a rectangle around each eye

cv2.imshow('dst', img)
# Show the annotated image
cv2.waitKey(0)

Output:

face= 4
eye= 2
eye= 2
eye= 1
eye= 2