In a previous blog, we discussed **Logistic Regression** models for solving classification problems. However, the logistic regression model solves the binary classification problem, that is, the model's output takes only two values, y=0 or y=1. In practice, our training set often contains more than two classes, so a binary variable (y=0 | y=1) cannot represent the outcome. For example, consider predicting the weather: it may be sunny, cloudy, rainy, snowy, foggy, and so on.
Here is one possible situation in a **Multiclass Classification** problem:
Three different shapes represent three different categories.
One way to solve such problems is the **one-vs-all** approach. In the one-vs-all method, we transform the multi-class classification problem into a series of binary classification problems. To achieve this, we mark one of the classes as positive (y=1) and all the others as negative (y=0), and train a binary classifier for that class. The model is denoted as:

Then, similarly, we choose another class to mark as positive, mark all the remaining classes as negative, and train a second classifier, written as:
And so on.
Finally, we have a series of models, abbreviated as:
where i = 1, 2, 3, …, k.
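Since the original formula images are missing, the family of classifiers can be reconstructed in standard notation as follows: each classifier estimates the probability that the input belongs to its own class.

$$
h_\theta^{(i)}(x) = P(y = i \mid x; \theta), \qquad i = 1, 2, 3, \ldots, k
$$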
The steps can be denoted as follows:
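The training procedure above can be sketched in code. The following is a minimal NumPy sketch (the function names and the plain gradient-descent training loop are my own illustration, not from the original article): one binary logistic regression classifier is trained per class, with that class relabeled as 1 and all others as 0.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def train_one_vs_all(X, y, num_classes, lr=0.1, iters=1000):
    """Train one binary logistic regression per class (one-vs-all).

    X: (m, n) feature matrix (include a bias column if desired)
    y: (m,) integer class labels 0..num_classes-1
    Returns a (num_classes, n) matrix of weights, one row per class.
    """
    m, n = X.shape
    all_theta = np.zeros((num_classes, n))
    for c in range(num_classes):
        # Relabel: class c becomes positive (1), all other classes negative (0)
        yc = (y == c).astype(float)
        theta = np.zeros(n)
        for _ in range(iters):
            # Gradient of the logistic loss for this binary subproblem
            grad = X.T @ (sigmoid(X @ theta) - yc) / m
            theta -= lr * grad
        all_theta[c] = theta
    return all_theta

def predict_one_vs_all(all_theta, X):
    """Run all k classifiers and pick the class with the highest probability."""
    return np.argmax(sigmoid(X @ all_theta.T), axis=1)
```

On a small three-cluster dataset, this trains three classifiers and predicts by taking the argmax over their outputs, exactly mirroring the steps described above.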
Finally, when we need to make a prediction for a new input, we run all k classifiers on it and choose the class whose classifier outputs the highest probability.
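In symbols (again a reconstruction, since the original formula image is missing), the prediction rule is:

$$
\hat{y} = \operatorname*{arg\,max}_{i}\; h_\theta^{(i)}(x)
$$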
This is the one-vs-all method for solving multi-class classification problems.
Next time, we will discuss **Regularization** for the problem of fitting the training data.