1. Introduction to SVM
General framework of machine learning
Training set => feature vector extraction => a learning algorithm (a classifier such as a decision tree or KNN) => results.
Definition of the SVM
A Support Vector Machine (SVM) is a binary classification model that maps the feature vector of each instance to a point in a space. The purpose of the SVM is to draw the line that “best” separates the two types of points, so that when a new point comes along in the future, the line also classifies it well. SVM is well suited to classification problems with small to medium-sized samples, nonlinear boundaries, and high-dimensional features.
SVM was first proposed by Vladimir N. Vapnik and Alexey Ya. Chervonenkis in 1963. The current version (soft margin) was proposed by Corinna Cortes and Vapnik in 1993 and published in 1995. Before the emergence of deep learning (2012), SVM was widely regarded as one of the most successful and best-performing machine learning algorithms of recent decades.
2 Basic concepts of SVM
Map the feature vector of each instance (two-dimensional in this example) to a point in the space. The solid points and hollow points in the figure below belong to two different categories. The purpose of SVM is to draw the line that “best” separates the two types of points, so that when a new point comes along in the future, the line also classifies it well.
Q1: How many lines can be drawn to separate the sample points? A: Infinitely many. They differ in how well they work. Each such line is called a partition hyperplane. For example, the green line in the figure is poor, the blue line is acceptable, and the red line looks better. The line we hope to find, the one that works best, is the partition hyperplane with the maximum margin.
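A minimal Python sketch of the point made in Q1, assuming a made-up toy data set and four hypothetical candidate lines (none of these values come from the original figure). Each line is written as w[0]*x + w[1]*y + b = 0, and a line separates the data when every point lies on the side that matches its label.

```python
# Toy 2-D data (made-up values): label -1 for hollow points, +1 for solid points.
points = [((1.0, 1.0), -1), ((2.0, 1.5), -1), ((1.5, 2.5), -1),
          ((4.0, 4.0), +1), ((5.0, 3.5), +1), ((4.5, 5.0), +1)]

def separates(w, b, data):
    """Return True if the line w[0]*x + w[1]*y + b = 0 puts every point
    on the side that matches its label."""
    return all(label * (w[0] * x + w[1] * y + b) > 0 for (x, y), label in data)

# Hypothetical candidate lines, each given as (w, b).
candidates = {
    "A": ((1.0, 1.0), -6.0),   # x + y = 6
    "B": ((1.0, 0.0), -3.0),   # x = 3
    "C": ((0.0, 1.0), -3.0),   # y = 3
    "D": ((1.0, 0.0), -1.8),   # x = 1.8, cuts through the hollow points
}

results = {name: separates(w, b, points) for name, (w, b) in candidates.items()}
print(results)
```

Lines A, B and C all separate this data, which is why a further criterion (the margin) is needed to choose among them; line D does not separate it at all.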
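Q2: Why is it called a hyperplane? A: Because the feature vectors of the samples are usually higher-dimensional, the boundary that divides the sample space is generally not a line. In an n-dimensional feature space it is an (n-1)-dimensional hyperplane: a line in 2-D, a plane in 3-D.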
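To illustrate why the general term is used, here is a small Python sketch, assuming the usual description of a hyperplane as the set of points x with w . x + b = 0. The function names and the numbers are hypothetical; the point is only that the same code works in any dimension.

```python
def decision_value(w, b, x):
    """Compute w . x + b for vectors of any (matching) dimension."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same dimension")
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x lies on."""
    return 1 if decision_value(w, b, x) >= 0 else -1

# 2-D: the hyperplane x1 + x2 - 3 = 0 is an ordinary line.
label_2d = classify([1.0, 1.0], -3.0, [2.0, 2.0])

# 4-D: x1 + x2 + x3 + x4 - 3 = 0 cannot be drawn as a line,
# but the same two functions still apply.
label_4d = classify([1.0, 1.0, 1.0, 1.0], -3.0, [0.5, 0.5, 0.5, 0.5])

print(label_2d, label_4d)
```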
Q3: What are the criteria for drawing the line? / What makes a line work well? A: SVM looks for the partition hyperplane that separates the two categories and maximizes the margin. Such a hyperplane is the least affected by local perturbations of the samples, so its classification results are the most robust and it generalizes best to unseen examples.
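The criterion in Q3 can be sketched as a brute-force search in Python. This is only an illustration on made-up data: real SVM solvers find the maximum-margin hyperplane by solving an optimization problem, not by trying angles one by one. The function name and the step count are assumptions.

```python
import math

# Toy 2-D data (made-up values): three points per class.
data = [((3.0, 1.0), +1), ((3.0, 3.0), +1), ((4.0, 2.0), +1),
        ((1.0, 1.0), -1), ((1.0, 3.0), -1), ((0.0, 2.0), -1)]

def best_line_by_search(data, steps=360):
    """Try unit normal vectors w at evenly spaced angles and keep the
    direction that leaves the widest empty strip between the classes."""
    best = None  # (margin, w, b)
    for k in range(steps):
        theta = 2.0 * math.pi * k / steps
        w = (math.cos(theta), math.sin(theta))
        # Project every point onto the direction w.
        pos = [w[0] * x + w[1] * y for (x, y), label in data if label == +1]
        neg = [w[0] * x + w[1] * y for (x, y), label in data if label == -1]
        gap = min(pos) - max(neg)  # width of the empty strip along w
        if gap <= 0:
            continue  # this direction does not separate the classes
        b = -(min(pos) + max(neg)) / 2.0  # place the line mid-strip
        if best is None or gap > best[0]:
            best = (gap, w, b)
    return best

margin, w, b = best_line_by_search(data)
print(margin, w, b)
```

On this data the search settles on the vertical line x = 2 with a margin of 2, the widest strip that fits between the two groups.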
Q4: What is the margin? A: For any hyperplane, the data points on each side have a minimum (perpendicular) distance to it. The sum of these two minimum distances is the margin. For example, the strip bounded by the two dotted lines in the figure below is the margin. Each dotted line passes through the points closest to the central solid line, that is, it is determined by the support vectors. In this drawing the margin is relatively small. If we draw the line the second way, the margin is clearly larger and closer to our goal.
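The definition in Q4 translates directly into a few lines of Python. The sketch below assumes made-up data and two hypothetical separating lines; it computes the smallest perpendicular distance on each side and adds the two.

```python
import math

# Toy 2-D data (made-up values).
points = [((1.0, 1.0), -1), ((2.0, 1.5), -1), ((1.5, 2.5), -1),
          ((4.0, 4.0), +1), ((5.0, 3.5), +1), ((4.5, 5.0), +1)]

def margin(w, b, data):
    """Sum of the smallest perpendicular distances on the two sides of the
    line w[0]*x + w[1]*y + b = 0.

    Assumes the line already separates the two classes.
    """
    norm = math.hypot(w[0], w[1])

    def distance(p):
        return abs(w[0] * p[0] + w[1] * p[1] + b) / norm

    nearest_pos = min(distance(p) for p, label in data if label == +1)
    nearest_neg = min(distance(p) for p, label in data if label == -1)
    return nearest_pos + nearest_neg

narrow = margin((1.0, 0.0), -3.0, points)   # the line x = 3
wide = margin((1.0, 1.0), -6.0, points)     # the line x + y = 6
print(narrow, wide)
```

Both lines separate the data, but the second leaves a wider strip, so by the margin criterion it is the better of the two candidates. Nothing here shows that it is the best possible line.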
Q5: Why make the margin as large as possible? A: Because a hyperplane with a larger margin is less likely to misclassify new points, so the classifier is more robust.
Q6: What are support vectors? A: As can be seen from the figure above, all the points on the dotted lines are at the same distance from the partition hyperplane. Only these points determine the position of the hyperplane, so they are called “support vectors”, hence the name “support vector machine”.
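As a rough Python illustration of Q6, the sketch below takes a hyperplane as given (the vertical line x = 2, which is the maximum-margin line for this made-up data) and picks out the points on each side that lie closest to it. The function name and the tolerance are assumptions.

```python
import math

# Toy 2-D data (made-up values) and its maximum-margin line x = 2.
data = [((3.0, 1.0), +1), ((3.0, 3.0), +1), ((4.0, 2.0), +1),
        ((1.0, 1.0), -1), ((1.0, 3.0), -1), ((0.0, 2.0), -1)]
w, b = (1.0, 0.0), -2.0

def support_vectors(w, b, data, tol=1e-9):
    """Return the points that are nearest to the line on each side."""
    norm = math.hypot(w[0], w[1])
    dists = [(abs(w[0] * x + w[1] * y + b) / norm, (x, y), label)
             for (x, y), label in data]
    result = []
    for side in (+1, -1):
        nearest = min(d for d, _, label in dists if label == side)
        # Keep every point on this side that ties for the minimum distance.
        result += [p for d, p, label in dists
                   if label == side and d - nearest <= tol]
    return result

sv = support_vectors(w, b, data)
print(sv)
```

The points (4, 2) and (0, 2) are not returned: they lie farther from the line than the support vectors, and moving them slightly would not change where the line goes.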
2. Partial source code
3. Results
4. MATLAB version and references
1 MATLAB version: 2019
