This is the third day of my participation in the Gwen Challenge.

This article contains notes for Li Hongyi's machine learning course; the lecture is 【Machine Learning 2021】 Predicting Viewer Counts (Part 2) – Introduction to Basic Concepts of Deep Learning.

Model Bias

A linear model may be too simple. In a linear model, the relationship between x1 and y is a straight line: as x1 gets higher, y gets bigger. You can change the slope of the line with different values of w, and change where the line intersects the y-axis with different values of b, but no matter how you change w and b it is always a straight line, and the larger x1 is, the larger y is: the more people who watch on one day, the more who watch the next day.
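As a minimal sketch of this limitation, here is the linear model y = b + w * x1 with made-up parameter values (the w and b below are illustrative, not from the lecture):

```python
def linear_model(x1, w=1.2, b=50.0):
    """Predict the next day's viewer count from the previous day's.

    Whatever w and b are, this is a straight line: for w > 0,
    a larger x1 always yields a larger y.
    """
    return b + w * x1

print(linear_model(100))  # 170.0
print(linear_model(200))  # 290.0
```

No choice of w and b can make this prediction bend downward after a peak, which is exactly the red-curve behavior described next.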

But maybe reality is not like this. Consider the red curve below: when x1 is below a certain value, the previous day's viewer count is proportional to the next day's; but once x1 exceeds a peak value, the previous day's viewer count is so high that the next day's count actually drops. A linear model can never produce this red curve, so linear models clearly have a significant limitation. A limitation that comes from the model itself is called Model Bias.

Piecewise Linear Curves

Model Bias is different from the bias term b mentioned above: it means the model cannot simulate the real situation. We need to write a more complex, more flexible function with unknown parameters.

If you look at the red curve, it can be viewed as a constant plus a bunch of blue functions. Each blue function has the following shape: when the x-axis value is below one threshold, it is a fixed constant; when the value is above another threshold, it is a different constant; in between there is a ramp. So it is flat, then slopes, then is flat again.

Note: the blue function marked 1 corresponds to the first sloped segment of the red curve. Its ramp starts where the red curve's first segment begins and ends where the second segment begins, and the two have the same slope. You can draw blue functions 2 and 3 in the same way for the later segments, and then 0 + 1 + 2 + 3 (the constant plus the three blue functions) gives you the full red curve.

Curves like this red one are called Piecewise Linear Curves. Any such curve can be composed as a constant term plus a bunch of blue functions, and the more turning points it has, the more blue functions you need.

Taking this to the extreme, consider a smooth curve like the one shown below. We can take points on it and connect them to form a Piecewise Linear Curve. If we take enough points, and therefore use enough blue functions, the piecewise approximation can get arbitrarily close to the curve.
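The composition described above can be sketched directly. Below, `blue_function` is one flat-ramp-flat piece, and `piecewise_curve` is a constant plus two of them; all the breakpoints and heights are made-up values chosen only to show the structure:

```python
def blue_function(x, left, right, low, high):
    # The "blue function": flat at `low` before `left`, a linear ramp
    # between `left` and `right`, flat at `high` after `right`.
    if x <= left:
        return low
    if x >= right:
        return high
    t = (x - left) / (right - left)
    return low + t * (high - low)

def piecewise_curve(x):
    # A constant term plus two blue functions (illustrative parameters).
    return (1.0
            + blue_function(x, 0.0, 1.0, 0.0, 2.0)
            + blue_function(x, 1.0, 2.0, 0.0, -1.0))

# The slope changes at x = 1: the curve rises from 1.0 to 3.0, then falls to 2.0.
print(piecewise_curve(0.0), piecewise_curve(1.0), piecewise_curve(2.0))  # 1.0 3.0 2.0
```

Each extra turning point in the target curve would require one more `blue_function` term in the sum, matching the claim in the text.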

So even though the relationship between x and y may be very complicated, we can try to write a function with unknown parameters that represents a bunch of blue functions plus a constant.

Sigmoid Function

So how do we write this function? The blue function is not easy to write down directly, but we can approximate it with a smooth curve, which we call the Sigmoid Function.

Note: when the input x1 approaches positive infinity, the exponential term vanishes and y converges to the height c; when x1 approaches negative infinity, the denominator becomes very large and y approaches 0. A curve drawn by such a function can therefore be used to approximate the blue function.

By adjusting b, w, and c, we can make Sigmoid Functions of various shapes, and then use differently shaped Sigmoid Functions to approximate the blue functions. Stacking them together makes it possible to approximate all sorts of Piecewise Linear Curves.

So we end up with something like the expression below, which is a very flexible function with unknown parameters. The part underlined in red is our original blue function.
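Putting the pieces together, the flexible function is a constant plus a weighted sum of sigmoids, each fed a weighted sum of the inputs. The sketch below assumes this form; the parameter names and demo values are made up for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def flexible_model(x, b, c, b_inner, w):
    # y = b + sum_i c[i] * sigmoid(b_inner[i] + sum_j w[i][j] * x[j])
    # Each c[i] * sigmoid(...) term plays the role of one blue function.
    y = b
    for ci, bi, wi in zip(c, b_inner, w):
        z = bi + sum(wij * xj for wij, xj in zip(wi, x))
        y += ci * sigmoid(z)
    return y

# One input feature, three stacked sigmoids (illustrative values):
y = flexible_model(x=[2.0], b=0.5,
                   c=[1.0, -2.0, 0.5],
                   b_inner=[0.0, 1.0, -1.0],
                   w=[[1.0], [0.5], [2.0]])
```

With more sigmoid terms, the same form can trace piecewise linear curves with more turns, which is what makes it so much more flexible than the straight-line model.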

The term collection

| English | Chinese |
| --- | --- |
| Model Bias | model bias |
| threshold | threshold |
| Piecewise Linear Curves | piecewise linear curve |
| Sigmoid Function | sigmoid function |