Bayesian decision making
The first thing we all know is the classic Bayes formula:

P(w | x) = P(x | w) P(w) / P(x)
P(x | w) is the class-conditional probability: the probability of observing x under the premise that the sample belongs to class w.
P(w | x) is the posterior probability: the probability that the sample belongs to class w given that x has been observed.
With this posterior probability we can classify a sample: the higher the posterior probability of a class, the more likely the sample falls into that class, so we assign it to the class with the largest posterior.
Let's look at a straightforward example. In summer, the probability that a man in a park wears sandals is 1/2, the probability that a woman wears sandals is 2/3, and the ratio of men to women in the park is 2:1. Question: if you meet a random person in the park wearing sandals, what is the probability that this person is male, and that this person is female?
The question asks: given that something has happened (the person wears sandals), what is the probability that it belongs to a certain category? That is exactly the posterior probability.
Suppose: w1 = male, w2 = female, x = wearing sandals.
From what is known: P(w1) = 2/3, P(w2) = 1/3, P(x | w1) = 1/2, P(x | w2) = 2/3. By the total probability formula, P(x) = P(x | w1)P(w1) + P(x | w2)P(w2) = 1/2 · 2/3 + 2/3 · 1/3 = 5/9. Therefore P(w1 | x) = (1/2 · 2/3) / (5/9) = 3/5, and P(w2 | x) = (2/3 · 1/3) / (5/9) = 2/5.
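The sandals example can be checked with a short script; the variable names here (`priors`, `likelihoods`) are just illustrative:

```python
# Posterior probabilities for the sandals example via Bayes' rule.
from fractions import Fraction

priors = {"male": Fraction(2, 3), "female": Fraction(1, 3)}        # P(w)
likelihoods = {"male": Fraction(1, 2), "female": Fraction(2, 3)}   # P(x | w)

# Total probability of seeing sandals: P(x) = sum over w of P(x | w) P(w)
evidence = sum(likelihoods[w] * priors[w] for w in priors)

# Posterior P(w | x) for each class
posteriors = {w: likelihoods[w] * priors[w] / evidence for w in priors}
print(posteriors)  # {'male': Fraction(3, 5), 'female': Fraction(2, 5)}
```

Using `Fraction` keeps the arithmetic exact, so the result matches the hand calculation of 3/5 and 2/5 precisely.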
Where the problem comes in
In practical problems we are rarely so lucky: the data available may be only a limited set of samples, and both the prior probability P(wi) and the class-conditional probability P(x | wi) (the distribution of each class's population) are unknown. To classify using only the sample data, a feasible approach is to first estimate the prior probability and the class-conditional probability, and then apply the Bayesian classifier.
Estimating the prior probability is relatively simple: 1. the natural state (class label) of each training sample is known (supervised learning); 2. we can rely on experience; 3. or we can estimate it from the frequency of each class in the training samples.
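Point 3 above is simply counting label frequencies. A minimal sketch, with made-up labels:

```python
from collections import Counter

# Hypothetical training labels (supervised: the class of each sample is known).
labels = ["male", "male", "female", "male", "female", "male"]

counts = Counter(labels)
n = len(labels)
# Estimated prior P(w) = (number of samples in class w) / (total samples)
priors = {w: c / n for w, c in counts.items()}
print(priors)  # {'male': 0.6666666666666666, 'female': 0.3333333333333333}
```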
Estimating the class-conditional probability is much harder, for the following reasons: the probability density function contains all the information about the random variable; the sample data may be small; the feature vector x may have many dimensions; and so on. In short, it is hard to estimate the conditional probability density function directly. The solution is to turn the estimation of the completely unknown density p(x | wi) into the estimation of its parameters.
In this way, the probability density estimation problem is transformed into a parameter estimation problem, and maximum likelihood estimation is one such parameter estimation method.
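For example, if we assume p(x | wi) is Gaussian, maximum likelihood yields familiar closed-form parameter estimates (the sample mean and the biased sample variance). A sketch with made-up 1-D data:

```python
import math

# Hypothetical 1-D feature values observed for one class wi.
samples = [1.2, 0.8, 1.0, 1.4, 0.6]

n = len(samples)
# MLE of the Gaussian mean: the sample average.
mu = sum(samples) / n
# MLE of the Gaussian variance: mean squared deviation (divide by n, not n - 1).
sigma2 = sum((x - mu) ** 2 for x in samples) / n

# With mu and sigma2 estimated, the density p(x | wi) is fully specified:
def density(x):
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

print(mu, sigma2)
```

Note the MLE of the variance divides by n; this is the estimate maximum likelihood produces, even though it is biased for small samples.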
Of course, the assumed form of the probability density function matters a great deal: if the model is right and the number of samples tends to infinity, we obtain a fairly accurate estimate; but if the assumed model is wrong, then no amount of estimation effort is of much use.