1. Low-dimensional to high-dimensional mapping
According to what we learned in the last video, the main task is to solve linearly separable problems, which are eventually converted into convex optimization problems that are considered solvable. But not every problem is linearly separable. For linearly inseparable problems, we can map the data from a lower dimension to a higher dimension, for example from two dimensions to three.
When the dimension M of the feature space rises, the dimension of the corresponding (ω, b) parameters to be estimated also rises. The whole model gains degrees of freedom, so it is more likely to separate data that could not be separated in the low dimension. The problem therefore changes from linear inseparability to finding a φ(x) that performs the mapping from the lower to the higher dimension.
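As a small illustration of the idea, the sketch below takes four XOR-style points that no straight line can separate in two dimensions and maps them to three. The mapping phi (appending the product x1 * x2) and the plane w = (0, 0, 1), b = 0 are assumptions chosen for this example, not something fixed by the notes.

```python
# Four XOR-style points in 2-D: no single straight line separates the classes.
points = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]


def phi(x):
    """Illustrative mapping from 2-D to 3-D: append the product x1 * x2."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def decision(z, w, b):
    """Sign of w^T z + b for a point z in the mapped space."""
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    return 1 if score > 0 else -1


# In 3-D the plane z3 = 0 (w = (0, 0, 1), b = 0) separates the two classes.
w, b = (0.0, 0.0, 1.0), 0.0
mapped = [phi(x) for x in points]
predictions = [decision(z, w, b) for z in mapped]
separable_in_3d = predictions == labels
print(separable_in_3d)
```

The third coordinate is positive for one class and negative for the other, so a plane that looks only at that coordinate is enough.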
2. The kernel function
In order to solve the problem of finding φ(x) above, a new concept is introduced: the kernel function, K(x1, x2) = φ(x1)^T φ(x2). The value of the kernel function is a real number: φ(x1) and φ(x2) are two vectors with the same dimension, and φ(x1)^T is the transpose of φ(x1), so the inner product of the two vectors gives a single number.
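To make the relation between a kernel and its mapping concrete, here is a minimal sketch using the degree-2 polynomial kernel K(x, y) = (x^T y)^2 on two-dimensional inputs. This kernel and its explicit feature map are a standard textbook pair chosen for illustration; the notes themselves do not name a specific kernel.

```python
import math


def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def inner(u, v):
    """Inner product of two vectors of the same dimension."""
    return sum(ui * vi for ui, vi in zip(u, v))


def kernel(x, y):
    """K(x, y) = (x^T y)^2, computed entirely in the original 2-D space."""
    return inner(x, y) ** 2


a, b = (1.0, 2.0), (3.0, -1.0)
via_kernel = kernel(a, b)              # no mapping needed
via_mapping = inner(phi(a), phi(b))    # map to 3-D first, then inner product
print(via_kernel, via_mapping)
```

Both routes give the same number (up to floating-point rounding), which is why the kernel can stand in for the inner product without ever computing φ(x).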
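The kernel function K and the mapping φ(x) correspond to each other: a given φ determines K, and a valid K guarantees that a suitable φ exists. The form of the kernel function cannot be chosen arbitrarily; it must satisfy the following two conditions (this is a theorem, remember it first). Mercer's theorem: K(x1, x2) can be written as φ(x1)^T φ(x2) if and only if (1) K is symmetric, K(x1, x2) = K(x2, x1), and (2) K is positive semi-definite, that is, for any points x1, ..., xN and any real coefficients c1, ..., cN, the sum over i and j of ci cj K(xi, xj) is greater than or equal to 0.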
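The two Mercer conditions, symmetry and positive semi-definiteness of the kernel matrix, can be spot-checked numerically. The sketch below assumes the Gaussian (RBF) kernel as an example of a valid kernel; the helper names and the sample data are hypothetical. Testing random coefficient vectors is only a sanity check, not a proof.

```python
import math
import random


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel, a standard example of a valid kernel."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(samples, kernel):
    """Matrix of kernel values K[i][j] = kernel(samples[i], samples[j])."""
    return [[kernel(a, b) for b in samples] for a in samples]


def quadratic_form(K, c):
    """Compute sum_i sum_j c_i * c_j * K[i][j]."""
    n = len(c)
    return sum(c[i] * c[j] * K[i][j] for i in range(n) for j in range(n))


rng = random.Random(0)
n = 6
samples = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
K = gram_matrix(samples, rbf_kernel)

# Condition 1: symmetry, K(x_i, x_j) == K(x_j, x_i).
is_symmetric = all(
    math.isclose(K[i][j], K[j][i]) for i in range(n) for j in range(n)
)

# Condition 2 (spot check): the quadratic form is non-negative for random c.
trials = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(200)]
is_psd_on_trials = all(quadratic_form(K, c) >= -1e-9 for c in trials)
print(is_symmetric, is_psd_on_trials)
```

A small negative tolerance is used in the second check to absorb floating-point rounding.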
3. Dual problem
The original problem:
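A common general form of the original (primal) problem is sketched below. The symbols f, g_i, h_j and the constraint counts p and q are assumed notation for this sketch, since the notes do not fix them in the text.

```latex
\begin{aligned}
\min_{\omega} \quad & f(\omega) \\
\text{s.t.} \quad & g_i(\omega) \le 0, \quad i = 1, \dots, p, \\
                  & h_j(\omega) = 0, \quad j = 1, \dots, q.
\end{aligned}
```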
Dual problem definition:
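For a problem that minimizes f(ω) subject to g_i(ω) ≤ 0 and h_j(ω) = 0, the dual problem is usually built from the Lagrangian. The formulation below is a sketch in that standard form; the multipliers α_i and β_j and the name Θ for the dual function are assumed notation.

```latex
\begin{aligned}
L(\omega, \alpha, \beta) &= f(\omega) + \sum_{i=1}^{p} \alpha_i \, g_i(\omega) + \sum_{j=1}^{q} \beta_j \, h_j(\omega), \\
\Theta(\alpha, \beta) &= \inf_{\omega} L(\omega, \alpha, \beta), \\
\max_{\alpha, \beta} \quad & \Theta(\alpha, \beta) \qquad \text{s.t.} \quad \alpha_i \ge 0, \quad i = 1, \dots, p.
\end{aligned}
```

The minimization over ω is carried out first, without constraints, and the result is then maximized over the multipliers.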
Theorem 1:
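The theorem that normally appears at this point is weak duality, and the sketch below assumes that is what Theorem 1 refers to. Here ω* denotes a solution of the original problem (minimize f(ω) subject to g_i(ω) ≤ 0 and h_j(ω) = 0), (α*, β*) a solution of the dual problem, and Θ the dual function; all of these symbols are assumed notation.

```latex
\begin{aligned}
\Theta(\alpha^*, \beta^*)
  &= \inf_{\omega} L(\omega, \alpha^*, \beta^*) \\
  &\le L(\omega^*, \alpha^*, \beta^*) \\
  &= f(\omega^*) + \sum_{i} \alpha_i^* \, g_i(\omega^*) + \sum_{j} \beta_j^* \, h_j(\omega^*) \\
  &\le f(\omega^*).
\end{aligned}
```

The last step holds because α_i* ≥ 0 and g_i(ω*) ≤ 0 make every term of the first sum non-positive, while h_j(ω*) = 0 makes the second sum vanish. So f(ω*) ≥ Θ(α*, β*).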
Duality gap: the difference between the optimal value of the original problem and the optimal value of the dual problem. It is always non-negative.
Strong duality theorem: If the objective function of the original problem is convex and the constraints are linear, then the optimal value of the original problem and the optimal value of the dual problem are the same, that is, the duality gap is zero.
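A one-variable toy problem shows the zero duality gap numerically. The problem used here (minimize x^2 subject to 1 - x ≤ 0) is an assumption made up for illustration: its objective is convex and its constraint is linear, so strong duality should hold. The grid search is a crude stand-in for a real solver.

```python
def f(x):
    """Convex objective of the toy primal problem."""
    return x * x


def g(x):
    """Linear constraint g(x) <= 0, which means x >= 1."""
    return 1.0 - x


def theta(alpha):
    """Dual function: inf over x of x^2 + alpha * (1 - x), attained at x = alpha / 2."""
    x_min = alpha / 2.0
    return f(x_min) + alpha * g(x_min)


grid = [i / 1000.0 for i in range(0, 5001)]  # 0.000, 0.001, ..., 5.000

# Primal: minimise f over the feasible grid points.
primal_value = min(f(x) for x in grid if g(x) <= 0)

# Dual: maximise theta over alpha >= 0.
dual_value = max(theta(a) for a in grid)

duality_gap = primal_value - dual_value
print(primal_value, dual_value, duality_gap)
```

The primal optimum is at x = 1 and the dual optimum at alpha = 2; both give the value 1, so the gap is zero.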
KKT conditions:
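In their usual textbook form, the KKT conditions at a primal solution ω* and dual solution (α*, β*) read as follows. This is a sketch in assumed notation (L is the Lagrangian, g_i the inequality constraints, h_j the equality constraints), and the notes may only have had the last line, complementary slackness, in mind.

```latex
\begin{aligned}
& \nabla_{\omega} L(\omega^*, \alpha^*, \beta^*) = 0, \\
& g_i(\omega^*) \le 0, \qquad h_j(\omega^*) = 0, \\
& \alpha_i^* \ge 0, \\
& \alpha_i^* \, g_i(\omega^*) = 0 \quad \text{for all } i.
\end{aligned}
```

The last line says that for every i, either α_i* = 0 or g_i(ω*) = 0. When the duality gap is zero, it follows from the chain f(ω*) = Θ(α*, β*) ≤ L(ω*, α*, β*) ≤ f(ω*), which forces every term α_i* g_i(ω*) to vanish.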
Conclusion:
1. Since many problems are not linearly separable directly, a mapping from low dimension to high dimension is used to handle linearly inseparable data, after which the method for linearly separable problems can be applied.
2. The key to mapping from low dimension to high dimension is the inner product φ(x1)^T φ(x2). The kernel function K(x1, x2) is introduced to replace it. The kernel function and φ(x) correspond to each other, so knowing one is enough to obtain the other form; Mercer's theorem states which functions are valid kernels.
3. The dual problem converts the minimization in the original problem into a maximization. We showed how the dual problem is derived and introduced the duality gap, the strong duality theorem, the KKT conditions and related concepts.