When a trained machine learning model predicts poorly, the cause is usually either high bias (underfitting) or high variance (overfitting). Plot the cost function of the cross-validation set and the cost function of the test set in one image:

The red region on the left corresponds to high bias (underfitting), and the red region on the right to high variance (overfitting).
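As a concrete illustration (my own sketch, not from the original post), the code below fits polynomials of increasing degree to synthetic data and compares the training and cross-validation costs; the data, the split, and all names are assumptions made for the example.

```python
import numpy as np

# Synthetic 1-D data: a sine curve plus noise (illustrative only).
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-3, 3, size=24))
y = np.sin(x) + rng.normal(scale=0.2, size=24)
x_train, y_train = x[::2], y[::2]   # even indices: training set
x_cv, y_cv = x[1::2], y[1::2]       # odd indices: cross-validation set

def j(coeffs, xs, ys):
    """Squared-error cost (1/2m) * sum((h(x) - y)^2), no regularization."""
    errors = np.polyval(coeffs, xs) - ys
    return (errors @ errors) / (2 * len(xs))

for degree in (1, 4, 9):
    coeffs = np.polyfit(x_train, y_train, degree)   # fit on training set only
    print(f"degree {degree}: J_train = {j(coeffs, x_train, y_train):.3f}, "
          f"J_cv = {j(coeffs, x_cv, y_cv):.3f}")

# Typical outcome: degree 1 underfits (both costs high, the left red region);
# degree 9 overfits (J_train near zero, J_cv clearly higher, the right red
# region); degree 4 lands in between.
```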


For this model, the cost function of regularized linear regression is:
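The formula appeared as an image in the original post; reconstructed in LaTeX from the course material:

```latex
J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
          + \lambda \sum_{j=1}^{n} \theta_j^2 \right]
```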

The blue box in the original figure marks the regularization term (the second sum above); the larger λ is, the more heavily the parameters θ_j are penalized.


The training-set cost function J_train(θ), the cross-validation cost function J_cv(θ), and the test-set cost function J_test(θ) carry no regularization term, as follows:
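Reconstructed from the course notation (the originals were an image; m, m_cv, and m_test are the sizes of the three splits):

```latex
J_{\mathrm{train}}(\theta) = \frac{1}{2m} \sum_{i=1}^{m}
    \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

J_{\mathrm{cv}}(\theta) = \frac{1}{2 m_{\mathrm{cv}}} \sum_{i=1}^{m_{\mathrm{cv}}}
    \left( h_\theta(x_{\mathrm{cv}}^{(i)}) - y_{\mathrm{cv}}^{(i)} \right)^2

J_{\mathrm{test}}(\theta) = \frac{1}{2 m_{\mathrm{test}}} \sum_{i=1}^{m_{\mathrm{test}}}
    \left( h_\theta(x_{\mathrm{test}}^{(i)}) - y_{\mathrm{test}}^{(i)} \right)^2
```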


With the regularized cost function, train the parameter set θ once for each candidate λ value, then compute the corresponding cross-validation cost J_cv(θ) and test cost J_test(θ) for each resulting θ.
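A minimal sketch of this selection loop (my own illustration, not the article's code), assuming NumPy, a normal-equation solver, and synthetic data; the variable names and the candidate λ grid are placeholders:

```python
import numpy as np

def add_bias(X):
    """Prepend a column of ones for the intercept term theta_0."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_ridge(X, y, lam):
    """Regularized linear regression via the normal equation;
    the intercept theta_0 is not penalized."""
    n = X.shape[1]
    reg = lam * np.eye(n)
    reg[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + reg, X.T @ y)

def cost(X, y, theta):
    """Unregularized cost J(theta) = (1/2m) * sum of squared errors."""
    residual = X @ theta - y
    return (residual @ residual) / (2 * X.shape[0])

# Synthetic data split into train / cross-validation / test (illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.3, size=150)
X_train, y_train = add_bias(X[:90]), y[:90]
X_cv, y_cv = add_bias(X[90:120]), y[90:120]
X_test, y_test = add_bias(X[120:]), y[120:]

lambdas = [0, 0.01, 0.03, 0.1, 0.3, 1, 3, 10]   # placeholder candidate grid
best_lam, best_theta, best_jcv = None, None, np.inf
for lam in lambdas:
    theta = fit_ridge(X_train, y_train, lam)   # train with this lambda
    j_cv = cost(X_cv, y_cv, theta)             # evaluate without regularization
    if j_cv < best_jcv:
        best_lam, best_theta, best_jcv = lam, theta, j_cv

# Report the test cost only once, for the lambda picked on the CV set.
print(f"best lambda = {best_lam}, J_cv = {best_jcv:.4f}, "
      f"J_test = {cost(X_test, y_test, best_theta):.4f}")
```

Note that λ is chosen on the cross-validation set and the test set is touched only once at the end; this keeps J_test(θ) an honest estimate of generalization.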

Plot them against λ in one image:

You’ll notice that as λ increases, J_train(θ) increases, meaning the model fits the training data less and less well. J_cv(θ) decreases at first, indicating that regularization is curbing overfitting and the model is generalizing better; but as λ keeps growing, the polynomial underfits the data and J_cv(θ) rises again. From this image you can read off the best fit: the λ that minimizes J_cv(θ).



P.S. This article is based on study notes from Andrew Ng's machine learning course. If you want to learn machine learning together, you can follow the WeChat public account "SuperFeng". Looking forward to meeting you.