In machine learning, when we train a model, we may find that the model's predictions differ too much from the actual data. In this case, we usually try to improve our algorithm in one of the following ways.
- Get more training examples;
- Reduce the number of features;
- Try adding features;
- Add polynomial features (increase the polynomial degree);
- Increase the regularization parameter lambda;
- Decrease the regularization parameter lambda.
These steps will usually take a lot of time, and choosing them aimlessly will probably not work.
In order to prevent this from happening, reduce your chances of getting mad, and maintain world peace, we need to use Machine Learning diagnostics to determine how to optimize our algorithms.
Evaluating a Hypothesis
We often run into fitting problems when training machine learning models. But how do we know when a model is overfitting? One way to find out is to plot the hypothesis function and look at its shape.
In practice, the function we fit often has many features, which makes it difficult to plot. We need another method: evaluating the hypothesis. How do we do that? Let's take a look.
Suppose we have a data set. We randomly split it into two parts in a 7:3 ratio: a training set and a test set. The training set is used to train the model, and the test set is used to evaluate the accuracy of the model.
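A random 7:3 split can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the helper name `split_train_test`, the toy arrays, and the fixed seed are made up for the example and do not come from the course.

```python
import numpy as np

def split_train_test(X, y, train_fraction=0.7, seed=0):
    """Shuffle the examples, then cut them into a training and a test part.

    Hypothetical helper for illustration; train_fraction=0.7 gives the 7:3 split.
    """
    rng = np.random.default_rng(seed)
    m = X.shape[0]                       # number of examples
    indices = rng.permutation(m)         # random order of the row indices
    n_train = int(round(train_fraction * m))
    train_idx, test_idx = indices[:n_train], indices[n_train:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Toy data: 10 examples with 2 features each (made up for the example).
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)
X_train, y_train, X_test, y_test = split_train_test(X, y)
```

Shuffling before cutting matters when the data set is stored in some order (for example, sorted by label); otherwise the test set may not look like the training set.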
For linear regression, we evaluate the model with the cost function computed on the test set.
For classification with logistic regression, we can evaluate the model not only with the cost function on the test set, but also with the test error (the misclassification error).
The test error is built from a simple per-example function: if the predicted label does not match the actual label, its value is 1 (an error occurred); otherwise its value is 0. Averaging these values over the test set gives the final test error.
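The formula itself is not reproduced in the text, so as a reminder, here is the usual squared-error cost written for the test set. The notation is an assumption: m_test is the number of test examples, h_θ is the hypothesis learned on the training set, and (x_test^(i), y_test^(i)) is the i-th test example.

```latex
J_{\text{test}}(\theta) = \frac{1}{2\, m_{\text{test}}} \sum_{i=1}^{m_{\text{test}}} \left( h_\theta\!\left(x_{\text{test}}^{(i)}\right) - y_{\text{test}}^{(i)} \right)^2
```

Note that θ is learned on the training set only; the test set is used just to compute this value.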
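The averaged 0/1 error described above might look like this in NumPy. It is a minimal sketch, assuming a logistic regression hypothesis with a 0.5 decision threshold; the function names, the parameter vector `theta`, and the four test examples are invented for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic function used by the logistic regression hypothesis."""
    return 1.0 / (1.0 + np.exp(-z))

def misclassification_error(theta, X, y, threshold=0.5):
    """Average 0/1 error: each example counts 1 if the predicted label is wrong, else 0."""
    probabilities = sigmoid(X @ theta)
    predictions = (probabilities >= threshold).astype(int)
    return float(np.mean(predictions != y))

# Made-up test set: first column is the bias term, second is one feature.
X_test = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5], [1.0, -3.0]])
y_test = np.array([1, 0, 0, 0])
theta = np.array([0.0, 1.0])   # assumed to have been learned on the training set

test_error = misclassification_error(theta, X_test, y_test)
print(test_error)  # 0.25: only the third example is misclassified
```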
In machine learning, we may choose polynomials of various degrees as models, but deciding which degree to use is a headache.
We use D to denote the degree of the candidate polynomial model.
For each candidate degree, we first use the data set to fit the parameter set θ, compute the corresponding cost function from θ, and then compare the costs to select the best polynomial as the model.
At this point, the data set can no longer be split as before. If we chose the degree using the test set, the test error would no longer be a fair estimate of how well the model generalizes. Instead, the data should be divided into three parts: a training set (60%), a cross validation set (20%), and a test set (20%).
First, we fit the parameter set θ on the training set; then we select the best polynomial model using the cross validation set; finally, we evaluate the chosen hypothesis on the test set.
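The three-step procedure can be sketched as follows. This is a rough illustration under stated assumptions, not the course's own code: the data is synthetic (a noisy quadratic), `np.polyfit` stands in for whatever training procedure you actually use, and the names `half_mse`, `fits`, and `cv_costs` are made up.

```python
import numpy as np

def half_mse(coeffs, x, y):
    """Squared-error cost (1 / 2m) * sum of squared residuals for a polynomial fit."""
    residuals = np.polyval(coeffs, x) - y
    return float(np.mean(residuals ** 2) / 2.0)

# Synthetic data for the example: a quadratic plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=100)
y = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.3, size=x.size)

# Shuffle, then split 60% / 20% / 20%.
idx = rng.permutation(x.size)
train, cv, test = idx[:60], idx[60:80], idx[80:]

# Step 1: fit the parameters for every candidate degree on the training set.
degrees = range(1, 7)
fits = {degree: np.polyfit(x[train], y[train], degree) for degree in degrees}

# Step 2: pick the degree with the lowest cost on the cross validation set.
cv_costs = {degree: half_mse(fits[degree], x[cv], y[cv]) for degree in degrees}
best_degree = min(cv_costs, key=cv_costs.get)

# Step 3: report the generalization error of the chosen model on the test set.
test_cost = half_mse(fits[best_degree], x[test], y[test])
print(best_degree, test_cost)
```

The test set is touched only once, after the degree has been fixed, which is what keeps the final number an honest estimate.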
P.S. This article is based on my study notes from Ng's machine learning course. If you want to learn machine learning together, you can follow the WeChat public account "SuperFeng". Looking forward to meeting you.