Overview
Support vector machine (SVM, also called support vector network) is one of the most popular algorithms in machine learning. It grew out of statistical learning theory and is the first strong learner we encounter outside of ensemble methods. In practice, SVM performs very well on a wide range of real-world problems. It is widely used in handwriting recognition, digit recognition, and face recognition, and it plays an important role in text and hypertext classification, because it can greatly reduce the number of labeled training examples needed in both the standard inductive and transductive settings. SVM is also used to build image classification and image segmentation systems; experiments show that after only three or four rounds of relevance feedback, SVM can achieve much higher search accuracy than query refinement schemes. Biology and many other sciences favor SVM as well: it is now widely used for protein classification, with an average accuracy of more than 90% for compound classification in industry. In cutting-edge biological research, support vector machines are also used to identify the features a model relies on for prediction, in order to find the factors that influence gene expression results.
1.1 How support vector machine classifiers work
What a support vector machine does is actually quite easy to understand. Consider a dataset with two classes of points, one class drawn as circles and the other as squares. The support vector machine classifies by finding a hyperplane in this distribution to serve as the decision boundary, such that the model's classification error on the data, and especially its classification error on unseen data (the generalization error), is as small as possible.
Hyperplanes
- In geometry, a hyperplane is a subspace whose dimension is one less than that of the space containing it. If the data space is three-dimensional, its hyperplanes are two-dimensional planes; if the data space is two-dimensional, its hyperplanes are one-dimensional straight lines.
- In a binary classification problem, a hyperplane is called a "decision boundary" of the data if it divides the data into two sets, each containing only one class.
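The role of a hyperplane as a decision boundary can be sketched numerically. In the sketch below, the weight vector `w` and bias `b` defining the hyperplane w·x + b = 0 are made-up illustrative values (not learned from any data); a point's class is determined by the sign of w·x + b:

```python
import numpy as np

# Hypothetical hyperplane w·x + b = 0 in 2-D, here the line x1 = x2.
# These values are chosen for illustration only.
w = np.array([1.0, -1.0])
b = 0.0

def classify(x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if np.dot(w, x) + b >= 0 else -1

print(classify(np.array([2.0, 1.0])))  # one side of the line → +1
print(classify(np.array([1.0, 2.0])))  # the other side → -1
```

Every point in the plane (except those exactly on the line) falls on one of the two sides, which is exactly what lets a hyperplane split the data into two sets.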
All points on one side of the decision boundary are assigned to one class, and all points on the other side to the other class. Once we find the decision boundary, the classification problem reduces to determining on which side of the boundary each sample lies. For example, in the data distribution described above, we could easily draw a line between the squares and the circles, classifying every sample that falls to the left of the line as a square and every sample that falls to the right as a circle. If we take this data as our training set, then as long as each side of the line contains only one class of points there are no classification errors, and our training error is zero.
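This left/right separation can be reproduced with a linear SVM. The toy coordinates and labels below are invented to mimic the description (squares on the left, circles on the right); since the two groups are linearly separable, the fitted classifier makes no errors on the training set:

```python
import numpy as np
from sklearn.svm import SVC

# Invented toy data: "squares" (label 0) on the left,
# "circles" (label 1) on the right.
X = np.array([[0.0, 0.0], [0.5, 1.0], [1.0, 0.5],   # squares
              [3.0, 0.5], [3.5, 1.0], [4.0, 0.0]])  # circles
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear")  # linear kernel → the boundary is a straight line
clf.fit(X, y)

# Linearly separable training data → zero training error.
print(clf.score(X, y))           # 1.0
# A new point on the left side is classified as a "square" (label 0).
print(clf.predict([[0.2, 0.8]]))
```

The decision boundary learned here is a line in the 2-D plane, matching the definition above: a hyperplane one dimension lower than the data space.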