Explain the confusing Poisson distribution with a chestnut

This article first appeared on my personal public account, TechFlow. Original writing is not easy, so please follow if you find it useful.

Today is the fifth article on probability and statistics, and with it we come to the end of the advanced mathematics topic. There is plenty of advanced mathematics left, such as multiple integrals and solving differential equations. But for algorithm work, basic calculus is pretty much enough, so we won't go any further for now, and we'll cover those topics in separate articles if the need arises.

Today's article is about the Poisson distribution in statistics.

Take a chestnut

The Poisson distribution is very important in probability and statistics, and it makes some otherwise hard-to-compute probabilities convenient to calculate. Many books will tell you that the Poisson distribution is essentially a binomial distribution, just in a form that simplifies the calculation. Conceptually this is true, but it's hard for beginners to fully understand.

So let's take an example to get an intuitive feel for it.

Suppose we have a chestnut tree, and now and then a chestnut falls from it because of wind or small animal activity. A chestnut falling is obviously a chance event, and at any given moment the probability of it happening is very low. How can we find the probability distribution of the number of chestnuts that fall? This is exactly the kind of problem the Poisson distribution solves.

At first glance, no model we know describes this problem directly, so it has to go through some transformation.

In fact, we can slice time up and turn this into a binomial distribution problem.

Let's say we divide the day into many small parts. In each part, a chestnut either falls or it doesn't, so each part is a yes-or-no trial and the whole day becomes a binomial distribution problem. Theoretically, no two chestnuts fall at exactly the same instant, so as long as we slice time finely enough, we can guarantee that at most one chestnut falls in any given slice (otherwise the binomial distribution would not apply).

Suppose we divide the day into n parts, each with probability p of a chestnut falling, and we want to know the probability that k chestnuts fall during the day. According to the binomial distribution formula, the probability is:

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$

With this we have taken a solid step forward: we have written down an expression for the probability.
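To make the binomial formula concrete, here is a minimal Python sketch. The numbers (24 one-hour slices and a 10% chance of a chestnut falling per slice) are made up for illustration and do not come from the article.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical numbers: 24 one-hour slices, 10% chance of a chestnut per slice
n, p = 24, 0.1
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(f"P(exactly 2 chestnuts) = {probs[2]:.4f}")
print(f"sum over all k         = {sum(probs):.6f}")  # should be 1 up to rounding
```

The probabilities over all possible k add up to 1, which is a quick check that the formula is being applied correctly.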

Deriving the Poisson distribution

We have this formula, but it doesn't seem to help much, because all we know is that p is the probability of a chestnut falling in one unit of time. How do we find out what that probability is? Do we really have to measure it?

To solve this problem, we go back to the binomial distribution and use it to find the expected number of chestnuts falling per day. In each of the n units of time, the probability of a chestnut falling is p, so the overall expectation is:

$$E(X) = np$$

Let's call this value λ, so that λ = np. Based on this, we can express p:

$$p = \frac{\lambda}{n}$$

If we substitute this p into the original formula, we get:

$$P(X = k) = \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n - k}$$

As we said, for the binomial distribution to apply, the unit of time must be small enough that two chestnuts never fall in the same unit. So n should be as large as possible, and we can use the limits we learned earlier: letting n approach infinity turns this into a limit problem.

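Let's calculate this limit, writing λ = np for the expected number of chestnuts per day:

$$\lim_{n \to \infty} \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n - k} = \lim_{n \to \infty} \frac{n(n-1)\cdots(n-k+1)}{n^k} \cdot \frac{\lambda^k}{k!} \cdot \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-k}$$

Let's break this limit apart, where, for fixed k:

$$\lim_{n \to \infty} \frac{n(n-1)\cdots(n-k+1)}{n^k} = 1, \qquad \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{-k} = 1, \qquad \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n} = e^{-\lambda}$$

Therefore, we get:

$$P(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}$$

This is the probability function of the Poisson distribution (strictly speaking a probability mass function rather than a density, since k is discrete). So the probability that k chestnuts fall in one day is $\frac{\lambda^k}{k!} e^{-\lambda}$.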
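As a quick sanity check on the formula $P(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}$, the steps below use only the Taylor series of $e^{\lambda}$. They show that the probabilities sum to 1 and that the mean is λ, which matches the expected count the derivation started from.

```latex
\begin{aligned}
\sum_{k=0}^{\infty} \frac{\lambda^k}{k!} e^{-\lambda}
  &= e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}
   = e^{-\lambda} e^{\lambda} = 1 \\
E(X) = \sum_{k=0}^{\infty} k \, \frac{\lambda^k}{k!} e^{-\lambda}
  &= \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!}
   = \lambda e^{-\lambda} e^{\lambda} = \lambda
\end{aligned}
```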

In other words, the Poisson distribution is what we get by dividing time infinitely finely and taking the limit of the binomial distribution. At its core, it is still a binomial distribution. The reason we use the Poisson distribution is that when n is very large and p is very small, the binomial formula is awkward to evaluate directly because of the huge binomial coefficients and exponents, whereas the Poisson formula gives a convenient approximation of the same probability.
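The limit can also be watched numerically. The sketch below holds the expected count fixed and lets n grow, comparing the binomial probability with the Poisson value. The choices λ = 3 and k = 2 are arbitrary illustrative values, not taken from the article.

```python
from math import comb, exp, factorial

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k events when the expected count is lam."""
    return lam**k * exp(-lam) / factorial(k)

lam, k = 3.0, 2  # hypothetical: 3 chestnuts expected per day, ask about exactly 2
target = poisson_pmf(k, lam)

errors = []
for n in (10, 100, 1000, 100000):
    b = binomial_pmf(k, n, lam / n)  # p = lam / n keeps the expectation fixed
    errors.append(abs(b - target))
    print(f"n={n:>6}  binomial={b:.6f}  poisson={target:.6f}  gap={errors[-1]:.2e}")
```

The gap shrinks as n grows, which is the convergence the derivation relies on.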

Conclusion and takeaways

From the derivation, it feels as though the Poisson distribution can be used whenever n is very large and p is very small. But this is only an intuition; there are rigorous statistical conditions for when it applies. Let's look at them. There are roughly three:

When time is divided infinitely finely, the probability of an event occurring within a very short period is proportional to the length of that period.
The probability of the event occurring twice or more within an infinitesimal period of time is infinitely close to zero.
Events in different time periods occur independently of each other.
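A small simulation can illustrate these conditions. The sketch below divides each day into many slices, gives every slice the same small probability of a chestnut falling independently of the others, and compares the observed daily counts with the Poisson formula. All the numbers (λ = 3, 500 slices, 4000 simulated days, the fixed seed) are assumptions chosen for illustration.

```python
import random
from collections import Counter
from math import exp, factorial

def simulate_day(n_slices: int, lam: float, rng: random.Random) -> int:
    """Count the chestnuts in one day: each slice independently drops at most one."""
    p = lam / n_slices  # probability per slice, proportional to the slice length
    return sum(rng.random() < p for _ in range(n_slices))

rng = random.Random(0)  # fixed seed so the run is repeatable
lam, n_slices, days = 3.0, 500, 4000
counts = Counter(simulate_day(n_slices, lam, rng) for _ in range(days))

for k in range(8):
    empirical = counts[k] / days
    theory = lam**k * exp(-lam) / factorial(k)
    print(f"k={k}  simulated={empirical:.4f}  poisson={theory:.4f}")
```

The simulated frequencies should land close to the Poisson values, with small differences caused by sampling noise and the finite number of slices.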

Finally, let's look at a textbook example to get a feel for the Poisson distribution in action. Say we have a batch of parts with a defect rate of 0.1 percent, or 1 in 1,000. What is the probability that, out of a thousand parts we produce, at least two are defective?

This is a fairly simple problem. To get the probability of two or more defective items, we just need to calculate the probabilities of zero defective items and of exactly one defective item, and subtract both from 1. So let's start with n and p:

$$n = 1000, \qquad p = 0.001, \qquad \lambda = np = 1$$

We substitute into the formula for the Poisson distribution:

$$P(X \ge 2) = 1 - P(X = 0) - P(X = 1) = 1 - \frac{1^0}{0!} e^{-1} - \frac{1^1}{1!} e^{-1} = 1 - \frac{2}{e} \approx 0.264$$

If we wanted to use the binomial distribution, we would have to calculate 0.999 to the thousandth power, which is tedious to do by hand, and avoiding that is exactly what the Poisson distribution is for.
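On a computer, both versions are easy to evaluate, so we can check how close the approximation is for this example. This is only a sketch of that comparison using the n and p from the problem; the variable names are my own.

```python
from math import exp

n, p = 1000, 0.001
lam = n * p  # expected number of defective parts

# Exact binomial: 1 - P(0 defects) - P(exactly 1 defect)
exact = 1 - (1 - p) ** n - n * p * (1 - p) ** (n - 1)

# Poisson approximation with the same expectation
approx = 1 - exp(-lam) - lam * exp(-lam)

print(f"binomial (exact):        {exact:.6f}")
print(f"Poisson (approximation): {approx:.6f}")
print(f"difference:              {abs(exact - approx):.2e}")
```

Both results round to about 0.264, so for a large n and a small p like these the approximation costs almost nothing in accuracy.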

That's all for today's article. Original writing is not easy, so if you found it useful, please follow me for more quality articles.
