Bayes again?

I talked a little bit about the naive Bayes algorithm earlier in the article. Going back to that simple example: if you saw a tall black man on the street, nine times out of ten you would probably guess that he was from Africa.

Why is that? Because in the absence of any other information, most Africans fit this profile, so you would pick the answer with the highest probability: African. That idea is called Bayesian thinking.

It’s a famous formula

If you’ve ever taken probability theory, you know this as Bayes’ theorem, named after Thomas Bayes. The formula says: there is an event A, and we have our own subjective judgment about its probability, P(A). Then an event B occurs, and we recognize that B’s occurrence changes the probability of A; the updated probability is written P(A | B).
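In symbols:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

Here P(B | A) is the probability of observing B when A is true, and P(B) is the overall probability of B.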

Let’s take a simple example

To put it simply: after you see some new evidence, the posterior probability that you woke up late is your prior belief, adjusted by that evidence.

Write it formally:
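As a minimal sketch (the specific piece of evidence here is an assumption for illustration), take A = “I woke up late” and B = “I arrived at work late”:

$$P(\text{woke up late} \mid \text{arrived late}) = \frac{P(\text{arrived late} \mid \text{woke up late})\; P(\text{woke up late})}{P(\text{arrived late})}$$

The prior P(woke up late) gets pushed up or down depending on how strongly arriving late points to oversleeping.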

Bayesian inference

Bayesian thinking updates probabilities by constantly taking in new evidence and revising beliefs; this probabilistic mode of thinking is called Bayesian inference.

Suppose we have some code that has just been written but has not been tested. At the beginning, we believe the probability that the code we wrote is bug-free is 80%. If a compilation error then occurs, the actual probability must be lower than 80%, and the belief keeps being updated as data comes in.
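Here is a minimal sketch of that single update in Python; the function name and all the likelihood values are illustrative assumptions, not measurements:

```python
# One Bayesian update for "is my freshly written code bug-free?".
# All probabilities below are illustrative assumptions.

def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(hypothesis | evidence) via Bayes' theorem."""
    p_evidence = (p_evidence_given_h * prior
                  + p_evidence_given_not_h * (1 - prior))
    return p_evidence_given_h * prior / p_evidence

p_no_bug = 0.80           # prior belief: the code is bug-free
p_error_if_no_bug = 0.05  # assumed: clean code rarely fails to compile
p_error_if_bug = 0.50     # assumed: buggy code often fails to compile

# Evidence observed: a compilation error.
posterior = bayes_update(p_no_bug, p_error_if_no_bug, p_error_if_bug)
print(f"P(no bug | compile error) = {posterior:.2f}")  # ~0.29, down from 0.80
```

One compile error drags the belief from 80% down to roughly 29%; each further piece of evidence would update it again in the same way.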

In fact, we update our previous beliefs as new evidence emerges, but we rarely reach an absolute judgment until all other possibilities have been ruled out.

Frequentists vs. Bayesians

The frequentist school

For frequentists, probability is the frequency with which an event occurs over the long run. If a fair coin is tossed 3 times at random and all 3 flips come up heads, the observed frequency of heads is 100%. But that is obviously not the probability: a fair coin has a 50/50 chance of landing heads or tails on each toss. The mismatch comes from the data sample being small. If the data is large enough, the probability can be verified by the frequency; this is the law of large numbers.
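A quick simulation sketch (assuming numpy is available) shows the frequency of heads converging to the true probability of 0.5 as the number of tosses grows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate fair coin tosses and watch the running frequency of heads.
for n in (3, 100, 10_000, 1_000_000):
    flips = rng.integers(0, 2, size=n)  # 1 = heads, 0 = tails
    print(f"n = {n:>9,}: frequency of heads = {flips.mean():.4f}")
```

At n = 3 the observed frequency can easily be 0 or 1; by a million tosses it sits very close to 0.5.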

The Bayesian school

For Bayesians, probability is a degree of confidence that an event will happen. For a repeatable event such as a coin toss, you can run many trials and reach a conclusion; there, frequency works. But for something like the next presidential election, frequency is no help; instead, a prior probability can express people’s confidence that a candidate will become president.

Prior probability and posterior probability

The prior probability is our belief that event A will happen, written P(A).

The posterior probability is our belief in event A after new evidence X arrives (each update is one iteration); it can increase, decrease, or stay the same. This updated belief in event A given the evidence is written P(A | X).

Adding “evidence”

As we add more evidence, the original belief is gradually washed away, and your subjective belief about the event moves closer to the objective probability.

Let N represent the amount of evidence we have. If N goes to infinity, Bayesian results are generally consistent with frequentist results.

For small N, introducing a prior probability and returning a probability distribution (rather than a fixed value) preserves the uncertainty, reflecting the instability of small data sets.
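A sketch that shows both effects at once, using a Beta-Binomial model for a coin (the uniform Beta(1, 1) prior and the simulation setup are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Beta-Binomial model for P(heads): prior Beta(1, 1), i.e. uniform.
alpha, beta = 1.0, 1.0

for n in (3, 30, 3_000, 300_000):
    heads = rng.binomial(n, 0.5)              # simulate n fair tosses
    a, b = alpha + heads, beta + (n - heads)  # posterior is Beta(a, b)
    mean = a / (a + b)
    sd = np.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))  # posterior std dev
    print(f"n = {n:>7,}: posterior mean = {mean:.3f}, sd = {sd:.4f}")
```

For small N the posterior standard deviation is large, so the uncertainty stays visible; as N grows, the posterior narrows around the frequentist estimate heads / n.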

Andrew Gelman has said that the sample is never big enough: if N is too small for a sufficiently precise estimate, you need to get more data; but once N is big enough, you can start subdividing the data to ask deeper questions.