18-19 (1.2.1 Probability density)

1.2.1 Probability density

As well as considering probabilities defined over discrete sets of events, we also wish to consider probabilities with respect to continuous variables. We will limit ourselves to a relatively informal discussion. If the probability of a real-valued variable $x$ falling in the interval $(x, x + \delta x)$ is given by $p(x)\,\delta x$ for $\delta x \to 0$, then $p(x)$ is called the probability density over $x$. See Figure 1.12. The probability that $x$ lies in the interval $(a, b)$ is given by

$$p(x \in (a, b)) = \int_a^b p(x)\,dx \tag{1.24}$$

Because probabilities are non-negative, and because the value of $x$ must lie somewhere on the real axis, the probability density $p(x)$ must satisfy the two conditions

$$p(x) \geq 0 \tag{1.25}$$

$$\int_{-\infty}^{\infty} p(x)\,dx = 1 \tag{1.26}$$
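The three relations (1.24) to (1.26) can be checked numerically for a concrete density. The sketch below is only an illustration: the standard Gaussian density, the helper names `gaussian_pdf` and `integrate`, and the finite range $(-10, 10)$ standing in for the whole real axis are all assumptions made for the example, not part of the text.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Univariate Gaussian density, used here as an illustrative p(x)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, n=10_000):
    """Composite midpoint rule for the integral of f over (a, b)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# (1.24): probability that x lies in the interval (a, b) = (-1, 1)
prob_interval = integrate(gaussian_pdf, -1.0, 1.0)

# (1.26): the density integrates to one; (-10, 10) approximates the real axis
total = integrate(gaussian_pdf, -10.0, 10.0)

# (1.25): non-negativity, checked on a grid of points in [-10, 10]
nonnegative = all(gaussian_pdf(-10.0 + 0.01 * i) >= 0.0 for i in range(2001))

print(prob_interval, total, nonnegative)
```

Note that $p(x)$ itself is not a probability: only $p(x)\,\delta x$, or an integral of $p(x)$ over an interval, is.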

Under a nonlinear change of variable, a probability density transforms differently from a simple function, because of the Jacobian factor. For example, if we consider a change of variables $x = g(y)$, then a function $f(x)$ becomes $\tilde{f}(y) = f(g(y))$. Now consider a probability density $p_x(x)$ that corresponds to a density $p_y(y)$ with respect to the new variable $y$, where the subscripts indicate that $p_x(x)$ and $p_y(y)$ are different densities. For small values of $\delta x$, observations falling in the range $(x, x + \delta x)$ are transformed into the range $(y, y + \delta y)$, where $p_x(x)\,\delta x \simeq p_y(y)\,\delta y$, and hence

$$p_y(y) = p_x(x) \left| \frac{dx}{dy} \right| = p_x(g(y))\,|g'(y)| \tag{1.27}$$

The probability that $x$ lies in the interval $(-\infty, z)$ is given by the cumulative distribution function, defined by

$$P(z) = \int_{-\infty}^{z} p(x)\,dx \tag{1.28}$$

which satisfies $P'(x) = p(x)$, as shown in Figure 1.12.
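The effect of the Jacobian factor can be seen numerically. The relation $p_x(x)\,\delta x \simeq p_y(y)\,\delta y$ implies $p_y(y) = p_x(g(y))\,|g'(y)|$. The sketch below assumes, purely for illustration, an exponential density $p_x(x) = e^{-x}$ on $x > 0$ and the change of variables $x = g(y) = e^{y}$; the function names are hypothetical. It compares the transformed density with the naive substitution $p_x(g(y))$, which is how a simple function would transform.

```python
import math

def p_x(x):
    """Exponential density on x > 0 (an illustrative choice)."""
    return math.exp(-x) if x > 0 else 0.0

def g(y):
    """Change of variables x = g(y) = exp(y)."""
    return math.exp(y)

def g_prime(y):
    """Derivative dx/dy of the change of variables."""
    return math.exp(y)

def p_y(y):
    """Transformed density p_y(y) = p_x(g(y)) |g'(y)|."""
    return p_x(g(y)) * abs(g_prime(y))

def integrate(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over (a, b)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Corresponding intervals carry the same probability mass.
mass_x = integrate(p_x, 1.0, 2.0)
mass_y = integrate(p_y, math.log(1.0), math.log(2.0))

# With the Jacobian factor the density in y still integrates to one;
# without it, the substituted function does not.
with_jacobian = integrate(p_y, -20.0, 5.0)
without_jacobian = integrate(lambda y: p_x(g(y)), -20.0, 5.0)

print(mass_x, mass_y, with_jacobian, without_jacobian)
```

The finite range $(-20, 5)$ in $y$ is an assumption that stands in for the whole real axis; it covers essentially all of the probability mass for this particular density.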

Figure 1.12 The concept of probability for discrete variables can be extended to a probability density $p(x)$ over a continuous variable $x$, such that the probability of $x$ lying in the interval $(x, x + \delta x)$ is given by $p(x)\,\delta x$ for $\delta x \to 0$. The probability density can be expressed as the derivative of the cumulative distribution function $P(x)$.
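The statement that the density is the derivative of the cumulative distribution function can be checked with a finite difference. This is a minimal sketch assuming a standard Gaussian, whose cumulative distribution function has a closed form in terms of `math.erf`; the step size `h` and the test points are arbitrary choices.

```python
import math

def p(x):
    """Standard Gaussian density (illustrative choice)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def P(z):
    """Cumulative distribution function of the standard Gaussian."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def derivative(F, x, h=1e-5):
    """Central finite-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

# Compare the numerical derivative of P with the density p at a few points.
points = (-1.5, 0.0, 0.7)
gaps = [abs(derivative(P, x) - p(x)) for x in points]

for x, gap in zip(points, gaps):
    print(x, derivative(P, x), p(x), gap)
```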

If we have several continuous variables $x_1, \ldots, x_D$, denoted collectively by the vector $\mathbf{x}$, then we can define a joint probability density $p(\mathbf{x}) = p(x_1, \ldots, x_D)$ such that the probability of $\mathbf{x}$ falling in an infinitesimal volume $\delta\mathbf{x}$ containing the point $\mathbf{x}$ is given by $p(\mathbf{x})\,\delta\mathbf{x}$. This multivariate probability density must satisfy

$$p(\mathbf{x}) \geq 0 \tag{1.29}$$

$$\int p(\mathbf{x})\,d\mathbf{x} = 1 \tag{1.30}$$

in which the integral is taken over the whole of $\mathbf{x}$ space. We can also consider joint probability distributions over a combination of discrete and continuous variables.
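As a small illustration of the multivariate normalization condition, the sketch below sums $p(\mathbf{x})\,\delta\mathbf{x}$ over a grid of small cells for $D = 2$. The two-dimensional Gaussian with independent components, the grid limits, and the function names are assumptions chosen for the example.

```python
import math

def p(x1, x2):
    """Two-dimensional standard Gaussian density (illustrative choice)."""
    return math.exp(-0.5 * (x1 * x1 + x2 * x2)) / (2.0 * math.pi)

def integrate_2d(f, lo, hi, n=400):
    """Sum f over an n-by-n grid of square cells covering (lo, hi)^2."""
    h = (hi - lo) / n
    centers = [lo + (i + 0.5) * h for i in range(n)]
    # Each cell contributes p(x) times its volume h * h.
    return h * h * sum(f(u, v) for u in centers for v in centers)

# The square (-8, 8)^2 stands in for the whole of x space.
total = integrate_2d(p, -8.0, 8.0)
print(total)
```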

Note that if $x$ is a discrete variable, then $p(x)$ is sometimes called a probability mass function, because it can be regarded as a set of "probability masses" concentrated at the allowed values of $x$.

The sum and product rules of probability, as well as Bayes' theorem, apply equally to probability densities and to combinations of discrete and continuous variables. For example, if $x$ and $y$ are two real variables, then the sum and product rules take the form

$$p(x) = \int p(x, y)\,dy \tag{1.31}$$

$$p(x, y) = p(y \mid x)\,p(x) \tag{1.32}$$

A formal justification of the sum and product rules for continuous variables requires a branch of mathematics called measure theory and lies outside the scope of this book. Their validity can be seen informally, however, by dividing each real variable into intervals of width $\Delta$ and considering the discrete probability distribution over these intervals. Taking the limit $\Delta \to 0$ then turns sums into integrals and gives the desired result.
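The informal argument can be imitated numerically: split $y$ into bins of width $\Delta$, apply the discrete sum rule over the bins, and watch the result approach the marginal density as $\Delta$ shrinks. The sketch below assumes a bivariate Gaussian with unit variances and correlation `RHO`, for which the marginal and conditional densities are known in closed form; the names, the evaluation point, and the truncation range are illustrative choices.

```python
import math

RHO = 0.6  # correlation between x and y (illustrative value)

def joint(x, y):
    """Bivariate Gaussian density with unit variances and correlation RHO."""
    norm = 2.0 * math.pi * math.sqrt(1.0 - RHO ** 2)
    quad = (x * x - 2.0 * RHO * x * y + y * y) / (1.0 - RHO ** 2)
    return math.exp(-0.5 * quad) / norm

def marginal(x):
    """Exact marginal p(x): a standard Gaussian."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def conditional(y, x):
    """Exact conditional p(y | x): Gaussian, mean RHO * x, variance 1 - RHO^2."""
    var = 1.0 - RHO ** 2
    return math.exp(-0.5 * (y - RHO * x) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def marginal_from_bins(x, delta, half_width=10.0):
    """Discrete sum rule over y-bins of width delta, evaluated at bin centres."""
    n = int(round(2.0 * half_width / delta))
    return sum(joint(x, -half_width + (j + 0.5) * delta) * delta for j in range(n))

x0 = 0.5
# Error of the binned sum rule (1.31) for progressively narrower bins.
errors = {d: abs(marginal_from_bins(x0, d) - marginal(x0)) for d in (2.0, 1.0, 0.5, 0.1)}

# Product rule (1.32) at a single point.
product_gap = abs(joint(x0, -0.4) - conditional(-0.4, x0) * marginal(x0))

print(errors, product_gap)
```

This is a numerical illustration of the limiting argument, not a proof.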
