1. Do you dare to invest in machine learning?

In this series so far, we have come to understand the concept of machine learning and worked through a concrete learning algorithm. It may seem that the door to machine learning has been opened, and that understanding deeper models and picking up more techniques will be smooth sailing from here.

But before moving on, let's review the notion of machine learning we have built up so far: based on the training data, pick an optimal Hypothesis from the Hypothesis Set and take it as the learned model.

Has a doubt like this ever crossed your mind along the way:

Can a model learned from training data, which is after all just a historical record, really perform just as well when predicting the future?

If you remember the "Ground Truth" in the upper left corner of Figure 1, the perfect model $f$ that only God knows, then I would like to ask:

Is the model $g$ that we learned really as perfect a predictor of the future as $f$?

Let's be more specific. Suppose that in the stock market we have learned a model that performs very well on historical data. Would you dare to invest real money according to that model's predictions of the future?

Have you ever, even for a moment or two, been unsure of yourself? As the saying goes, "we can learn from history to know the rise and fall"; can rules drawn from history really be relied on to predict the future? To be machine learning practitioners who are responsible to the future, we will try to answer this question in the next two parts, starting with this one, namely:

Are models really reliable about the future?

Figure 1: the machine learning setup, with the Ground Truth $f$ in the upper left corner.

2. No one can predict the future

Don’t be silly. No one can predict the future.



— From the Future

Let’s start with the following example.

Someone, nobody knows who or from where, got hold of a batch of data, shown in the first three columns of Table 1 (sample number, features, annotation), hoping we could work out the pattern relating the features to the annotations.

So four algorithm engineers, Baxing, Pili, Wang Kangmei, and Wolpert, studied the problem, and each of them finally produced a model.

  • It is said that Baxing used the PLA learning algorithm. He denoted his model, which approximates the Ground Truth $f$, as $g_1$.
  • It is said that Pili used a reinforcement learning algorithm. He denoted his model, which approximates the Ground Truth $f$, as $g_2$.
  • It is said that Wang Kangmei used a deep learning algorithm. He denoted his model, which approximates the Ground Truth $f$, as $g_3$.
  • Wolpert didn't reveal his algorithm, but he got a model too. He denoted it, likewise approximating the Ground Truth $f$, as $g_4$.

How do these four models perform? Look at the last four columns of Table 1, which record each model's outputs. Comparing them with the sample annotations in the third column, you can see that every model's outputs agree exactly with the annotations in the training data: all four participants' models perform remarkably well.

But we also know that doing well on the training data does not count for much; in the end, what we want is for the model to make reliable predictions on unknown data, and only good predictions really count. So we found two more samples, listed in Table 2, whose labels are unknown, and asked each of the four models to predict what those labels should be.

The results are shown in the last four columns of Table 2, and here something tricky happens: the four models give completely different predictions!

  • Baxing's model $g_1$: the predicted results are 1 and −1.
  • Pili's model $g_2$: the predicted results are −1 and −1.
  • Wang Kangmei's model $g_3$: the predicted results are 1 and 1.
  • Wolpert's model $g_4$: the predicted results are −1 and 1.

This is confusing. Four people using four different algorithms have all learned models that perform perfectly on the training data, but the predictions are completely different.

So whose algorithm should we trust? Whose prediction is accurate?

Sorry for leading you all this way, because here comes a potentially subversive conclusion:

There is a very famous theorem in machine learning, the No Free Lunch (NFL) theorem, and it tells us that in the situation described by this example, the expected performance of these algorithms is the same.
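
To make that claim tangible, here is a minimal, self-contained sketch in Python (my own illustration with hypothetical features, labels, and counts, not the article's actual tables): when every Ground Truth consistent with the training data is equally likely, any model that fits the training data scores an expected accuracy of exactly 0.5 on the unseen inputs.

```python
from itertools import product

# All 8 possible 3-bit inputs; 6 serve as training data, 2 are unseen.
# (Hypothetical features and labels, chosen only for illustration.)
FEATURES = list(product([0, 1], repeat=3))
train_x, test_x = FEATURES[:6], FEATURES[6:]
train_y = [-1, +1, +1, -1, +1, -1]

# Every Ground Truth f consistent with the training data agrees on
# train_x and differs only on the 2 unseen inputs -> 2**2 = 4 candidates,
# all equally likely under the NFL premise.
candidates = []
for unseen_labels in product([-1, +1], repeat=len(test_x)):
    f = dict(zip(train_x, train_y))
    f.update(zip(test_x, unseen_labels))
    candidates.append(f)

# Any deterministic model g that fits the training data is itself one of
# these candidates; pick any one as the "learned" model and average its
# accuracy on the unseen inputs over all equally likely Ground Truths.
g = candidates[0]
matches = sum(g[x] == f[x] for f in candidates for x in test_x)
print(matches / (len(candidates) * len(test_x)))  # -> 0.5: a coin flip
```

Swap `g` for any other consistent candidate, or for a fixed guessing rule, and the average stays at 0.5; this is the theorem's point in miniature.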

What's more, at this point Wolpert revealed that he hadn't used any machine learning algorithm at all: his predictions were pure guesswork. If he is right, the No Free Lunch theorem is telling us that the models produced by those sophisticated algorithms are merely on a par with guesswork.

How is that possible? This person is lying!!

He's not lying. Wolpert, full name David H. Wolpert, is the scientist who proposed and proved the No Free Lunch theorem. "No Free Lunch", as in "there is no such thing as a free lunch", is a foundation stone on every machine learning expert's road of learning.

3. Rebuilding our world view after No Free Lunch

No Free Lunch: an unappetizing lunch that shattered our faith in machine learning. Learning algorithms are little better than guesswork! So machine learning is impossible, right? How could anyone trust a guesswork-grade algorithm to invest?

Is there something missing?!

Yes, it does leave out a very important premise. Let's flash back to that passage in the previous section:

"There is a very famous theorem in machine learning, the No Free Lunch (NFL) theorem, and it tells us that in the situation described by this example, the expected performance of these algorithms is the same."

Notice the phrase: "in the situation described by this example"! In other words, the NFL theorem has an important premise; only under that premise does it follow that sophisticated machine learning algorithms are little more than guesswork, and in many concrete practical applications that premise is not satisfied.

So what exactly is "the situation described by this example"? Remember it well:

In its derivation, the NFL theorem assumes that the Ground Truth $f$ follows a uniform distribution; that is, all potential possibilities are equally likely to occur.
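
For the mathematically inclined, one common textbook formulation of the theorem (for binary classification; the notation below follows such textbook treatments rather than anything in this article) sums the expected off-training-set error of an algorithm $\mathfrak{L}_a$ over all equally likely ground truths $f$ and finds a constant with no trace of the algorithm in it:

```latex
% E_ote: expected off-training-set error; X: the training set;
% \mathcal{X}: the finite input space; \mathfrak{L}_a: any learning algorithm.
\sum_{f} E_{ote}(\mathfrak{L}_a \mid X, f)
    = 2^{|\mathcal{X}|-1} \sum_{x \in \mathcal{X} \setminus X} P(x)
```

Because the right-hand side does not depend on $\mathfrak{L}_a$, a sophisticated algorithm and blind guessing earn exactly the same total.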

Specifically, in the example above, because the training data came from an unknown source, there are many candidate Ground Truths $f$ that could have generated it; and because there is no concrete scenario to constrain them, each of these many candidate $f$'s is equally likely to be the one that actually produced this batch of training data.

Meanwhile, any learning algorithm must ultimately favor one candidate over the others, so in the end the expected performance of all the algorithms is the same. This tendency of a learning algorithm is called its inductive bias (hereinafter, its "preference"); a toy illustration follows.
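
As a quick sketch (the two learners and their names are hypothetical, invented here for illustration): both fit the same training data perfectly, yet their preferences lead them to label the unseen input differently.

```python
def learn_optimist(train, unseen):
    """Inductive bias: predict +1 wherever the data says nothing."""
    g = dict(train)                     # fit the training data exactly
    g.update({x: +1 for x in unseen})   # preference: optimism
    return g

def learn_majority(train, unseen):
    """Inductive bias: predict the training data's majority label."""
    labels = list(train.values())
    majority = +1 if labels.count(+1) >= labels.count(-1) else -1
    g = dict(train)                     # fit the training data exactly
    g.update({x: majority for x in unseen})
    return g

train = {(0, 0): -1, (0, 1): -1, (1, 0): +1}   # toy training data
unseen = [(1, 1)]                              # the input to extrapolate
print(learn_optimist(train, unseen)[(1, 1)])   # -> 1
print(learn_majority(train, unseen)[(1, 1)])   # -> -1: same data, different preference
```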

As we will see, in a concrete real-world scenario some candidates are highly probable, some are improbable, and some cannot occur at all. They are not equally likely, so the premise of the NFL theorem no longer holds. Take, for example, an opaque piggy bank holding hundreds of coins. You pick out a handful at random, and every coin you grab is a $1 coin. If you had to predict the next handful, it is obvious that the probability of getting all dimes is far lower than the probability of getting all $1 coins.
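
Here is a minimal numerical sketch of that intuition (the two candidate "worlds" and every number in it are hypothetical): conditioning on the observed handful makes the candidate truths unequal, which is precisely where the NFL premise breaks.

```python
# Two candidate worlds for what the opaque bank contains.
worlds = {"mostly_dollars": 0.9, "mostly_dimes": 0.1}  # P(a drawn coin is $1)
prior = {"mostly_dollars": 0.5, "mostly_dimes": 0.5}   # no initial preference

n = 8  # we grabbed 8 coins and every one of them was a $1 coin

# Bayes: posterior over the worlds given the all-dollars handful.
likelihood = {w: p ** n for w, p in worlds.items()}
evidence = sum(prior[w] * likelihood[w] for w in worlds)
posterior = {w: prior[w] * likelihood[w] / evidence for w in worlds}
print(posterior)  # mostly_dollars ~ 0.99999998, mostly_dimes ~ 2e-8

# Probability the next coin drawn is a $1 coin, averaged over worlds.
p_next_dollar = sum(posterior[w] * worlds[w] for w in worlds)
print(p_next_dollar)  # ~ 0.9: predicting "dollars again" is no blind guess
```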

This time, inside the opaque pot, among the God's-eye truths we cannot see, the world has tilted: the candidates are no longer equally likely.

Therefore, on concrete practical problems, the algorithms whose inductive preference matches the problem itself can achieve good results, and this is what makes machine learning feasible.

To conclude: we went to all this trouble, taking the long way around, just to help you understand the philosophy behind the NFL theorem, which is:

It makes no sense to talk about machine learning algorithms outside the context of specific problems.

If someone claims, without reference to any specific problem, that learning algorithm XX is better than learning algorithm YY, then they are most likely either fooling you or fooling themselves.

4. Preview and other remarks

Because my spare time and energy are limited, I could not finish this topic, "Why is machine learning feasible?", in one week, so I had to split it into two parts. That this series has reached a fourth and even a fifth installment is already beyond my initial imagination. Thanks to all the readers who have encouraged me. Writing one article every week is a test for me, but also a rich harvest. I hope I can keep it up, and I hope this series can bring you a little value.

Thank you again for reading. This is The Machine Learning Book for Everyone, and I'm Baxing. If you'd like to receive updates on future articles, consider following me, or follow the column of the same name and updates will appear in your notification center.

Have fun 🙂