Generative models and discriminative models
Basic concepts
The goal of supervised learning is to learn a model that produces a specific output for a given input, so as to predict the category of the data. Such a model may be called a classifier. The model generally takes the form of a decision function Y = f(X) or a conditional probability distribution P(Y|X) (in mathematical statistics, X and Y are the random variables, x and y their observed samples).
For a decision function Y = f(X), you need to set a threshold that turns the score into a class judgment.
For a conditional probability distribution P(Y|X), the probability of each class is computed, and the class with the highest probability is selected. That's it.
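The two views above can be sketched side by side. This is a minimal illustration (the threshold value and the toy probabilities are arbitrary choices for the example):

```python
def decision_classify(score, threshold=0.0):
    """Decision-function view: f(X) returns a score; a threshold turns it into a label."""
    return 1 if score > threshold else 0

def probabilistic_classify(probs):
    """Conditional-distribution view: P(Y|X) gives one probability per class;
    the class with the highest probability is selected."""
    return max(range(len(probs)), key=lambda k: probs[k])

label_a = decision_classify(0.7)                   # score above threshold -> class 1
label_b = probabilistic_classify([0.1, 0.6, 0.3])  # argmax over P(Y|X) -> class 1
```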
The connection between the two:
The two are essentially the same.
When the network takes the P(Y|X) form, the training objective makes the network output as close as possible to the true label (generally one-hot encoded), which is in fact a maximum-likelihood idea. For a given input X, training the network to bring its output close to the true label Y (i.e., to maximize the probability of the observed label) is to maximize P(Y|X), or equivalently its logarithm (the likelihood here is the probability that the observed pair (X, Y) occurs, and training drives it to an extremum). So the output of the network here is essentially P(Y|X).
Of course, when the Y = f(X) form is taken, the function output is simply used directly…
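The maximum-likelihood reading of one-hot training can be made concrete: with a one-hot target, the cross-entropy loss reduces to -log P(y_true|X), so minimizing it maximizes the likelihood of the observed label. A minimal sketch (the label and probability values are made up for the example):

```python
import math

def cross_entropy(one_hot, probs):
    """Cross-entropy between a one-hot label and the network's P(Y|X) output.
    With a one-hot target this reduces to -log P(y_true|X)."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

label = [0.0, 1.0, 0.0]   # one-hot encoding of class 1
probs = [0.2, 0.7, 0.1]   # softmax output of the network
loss = cross_entropy(label, probs)   # equals -log(0.7)
```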
Generative methods and discriminative methods
Supervised learning methods can be divided into the generative approach and the discriminative approach; the models they learn are the generative model and the discriminative model, respectively.
Discriminative models
A discriminative method learns a decision function f(X) (or a conditional probability P(Y|X)) directly from the data. Typical discriminative models include k-nearest neighbors, support vector machines, and decision trees… A discriminative model focuses only on how to classify (how to find the optimal classification surface for the given data space through feature mapping); the model mainly captures the differences between categories. Because the discriminative model models the prediction directly, it is efficient and usually performs well.
Generative models
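As a concrete instance of modeling P(Y|X) directly, here is a tiny logistic regression trained by plain gradient descent on hypothetical 1-D data (the data, learning rate, and iteration count are all illustrative choices):

```python
import math

# Hypothetical 1-D data: class 1 tends to have larger x.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = 0.0, 0.0
for _ in range(2000):                  # plain gradient descent on the log-loss
    gw = gb = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))   # models P(Y=1|X) directly
        gw += (p - y) * x
        gb += (p - y)
    w -= 0.1 * gw
    b -= 0.1 * gb

def predict(x):
    """Threshold the learned P(Y=1|X) at 0.5 to get a class label."""
    return 1 if 1.0 / (1.0 + math.exp(-(w * x + b))) > 0.5 else 0
```

Note that the model never estimates how the inputs themselves are distributed; it only learns the boundary between the classes.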
A generative method learns the joint probability distribution P(X, Y) from the data (the joint density can also be sampled to produce more data for the data set), and then computes the conditional via Bayes' rule as the prediction model, that is, the generative model: P(Y|X) = P(X, Y) / P(X). The generative model in principle needs a very large number of samples to reach its theoretical predictions, because reliably estimating P(X, Y) requires many samples. Typical generative models include naive Bayes and hidden Markov models. The generative model is concerned with the data itself, unlike the discriminative model, which is concerned with the optimal classification surface. Generative models can also be used in models with hidden variables, where discriminative models are not applicable.
The correspondence between generative and discriminative models in deep networks
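The Bayes'-rule step P(Y|X) = P(X, Y) / P(X) can be shown on hypothetical categorical data, estimating the joint from raw counts (the fruit data is invented for the example):

```python
from collections import Counter

# Hypothetical categorical data: (feature, label) pairs.
data = [("red", "apple"), ("red", "apple"), ("green", "apple"),
        ("green", "pear"), ("green", "pear"), ("red", "pear")]

joint = Counter(data)            # estimates P(X, Y) from counts
total = sum(joint.values())

def posterior(x, y):
    """P(Y=y | X=x) = P(X=x, Y=y) / P(X=x), i.e. Bayes' rule on the joint."""
    p_xy = joint[(x, y)] / total
    p_x = sum(v for (xi, _), v in joint.items() if xi == x) / total
    return p_xy / p_x
```

For instance, `posterior("red", "apple")` divides the joint probability 2/6 by the marginal 3/6, giving 2/3.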
Deep networks can simulate many probability distribution functions.
Discriminative model: the output of a classification network fits P(Y|X). Suppose the parameters of the network are θ; by the maximum-likelihood principle, the input of the network is X and the output of the network is P(Y|X; θ). The training objective can be written as θ* = argmax_θ P(Y|X; θ).
Generative model: the network fits P(X|Y) and P(Y), i.e., the joint probability distribution P(X, Y) = P(X|Y) P(Y), and then Bayes' rule P(Y|X) = P(X, Y) / P(X) is used at test time. The generative model here is a very narrow concept!! (It covers only the case where a generative model is used for classification in supervised learning.) In practice, "generative model" is a concept in probability, statistics, and machine learning that refers to a family of models used to randomly generate observable data. A generative model has two basic functions: one is to learn a probability distribution, i.e., the density-estimation problem; the other is to generate data. For supervised learning, typical generative models include naive Bayes, hidden Markov models, and Gaussian mixtures. These all model P(X, Y) directly and finally use Bayesian inference to obtain the category of the data. A generative model in the broad sense models the data itself in order to generate new data (GAN, VAE, etc.). For example, VAE generates images through a latent variable: P(X) = ∫ P(X|Z) P(Z) dZ. A Monte Carlo approximation gives P(X) ≈ (1/N) Σ_i P(X|z_i), in which each z_i is sampled once from P(Z). Generative models are used to generate data, especially images, so where does this show up? If the network can model P(X) and we obtain a distribution that we can sample from, then sampling from that probability distribution yields new data (note that no label is involved here), and we have a generative model.
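The Monte Carlo approximation P(X) ≈ (1/N) Σ_i P(X|z_i) can be checked numerically. In this sketch the "decoder" P(X|Z) is simply assumed to be a Gaussian N(x; z, 1) with prior Z ~ N(0, 1), so the true marginal is N(0, 2) and the estimate can be compared against it:

```python
import math
import random

random.seed(0)

def p_x_given_z(x, z, sigma=1.0):
    """Assumed Gaussian likelihood N(x; z, sigma^2), standing in for a decoder."""
    return math.exp(-(x - z) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def p_x_monte_carlo(x, n=100_000):
    """P(X) = E_{z ~ P(Z)}[P(X|z)], approximated by averaging over samples z_i ~ N(0, 1)."""
    return sum(p_x_given_z(x, random.gauss(0, 1)) for _ in range(n)) / n
```

With z ~ N(0, 1) and x|z ~ N(z, 1), the marginal is N(0, 2), so the estimate at x = 0 should be close to 1/√(4π) ≈ 0.282.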
Modeling probability with deep networks: both of the above use a deep network to model a probability, but note that the output of the network is not necessarily P(Y|X) or P(X) itself. For example, when P(X|Z) is taken to be Gaussian, the network can output the mean (and variance) of that Gaussian. Understand that there is a difference between the object being modeled and the output of the network!! Don't get confused!! Moreover, the quantities output by the network follow a logical chain, such as the relation between the prior probability and the posterior probability.
The following VAE example illustrates how a deep network models probability:
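The distinction between "what the network models" and "what the network outputs" can be sketched as follows. Here `fake_network` is a hypothetical stand-in for a network that outputs only the Gaussian mean μ(z); the modeled quantity log P(X|Z) is then computed from that output (the linear form of μ(z) and the fixed variance are arbitrary choices):

```python
import math

def fake_network(z):
    """Hypothetical stand-in for a network whose *output* is the Gaussian mean mu(z),
    not the probability P(X|Z) itself."""
    return 2.0 * z + 1.0          # mu(z); variance fixed at sigma^2 = 0.25

def log_p_x_given_z(x, z, sigma=0.5):
    """The *modeled* quantity: log N(x; mu(z), sigma^2), computed FROM the output."""
    mu = fake_network(z)
    return -((x - mu) ** 2) / (2 * sigma ** 2) - math.log(sigma * math.sqrt(2 * math.pi))
```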
VAE fits the true posterior probability P(Z|X) with an MLP; the output of this network is the mean μ and variance σ², and what the network models is q(Z|X). This part of the model is called the recognition model. The second half of the network models P(X|Z): the output of this network is the mean μ', with the variance manually set to a small value, so the network ultimately models the probability distribution P(X|Z). The input to this network is Z, which is sampled from the distribution output by the recognition model (this can be directly understood as sampling Z from q(Z|X)). Therefore, the final output logic of the network is P(X) ≈ (1/N) Σ_i P(X|z_i). If the final result is sampled only once (it is sampled only once because the final result depends on the sampling of Z), then you get P(X) ≈ P(X|z). Because when the variance is set very small, the distribution output by the network is sharply peaked around its mean, samples from it are very close to that mean, so we can treat the output of the network as the reconstruction of X. You don't have to sample from P(X|z) at this point (besides, sampling would require knowing the specific expression of the distribution). The final output of the network approximates X, and what the network models is P(X|Z). So that's the specific analysis. Read my next post about VAE.
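The forward pass described above can be sketched end to end. This is a toy, untrained stand-in: the "encoder" and "decoder" are hypothetical fixed functions rather than real MLPs, and only the sampling logic (the reparameterization z = μ + σ·ε) matches the VAE recipe:

```python
import random

random.seed(0)

def encoder(x):
    """Recognition model q(Z|X): outputs (mu, sigma) of a Gaussian (toy stand-in)."""
    return 0.5 * x, 0.1            # mu(x); sigma fixed for the sketch

def decoder(z):
    """Models P(X|Z): outputs the Gaussian mean; the variance is set small by hand,
    so this mean is taken directly as the reconstruction (no sampling needed)."""
    return 2.0 * z                 # mu'(z)

def reconstruct(x):
    mu, sigma = encoder(x)
    z = mu + sigma * random.gauss(0, 1)   # reparameterization: sample Z ~ q(Z|X)
    return decoder(z)                     # with tiny decoder variance, mean ≈ sample
```

Because the decoder variance is tiny, the mean it outputs stands in for a sample of X, exactly as argued in the text.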
There are surely problems in my discussion; I hope you will point them out, and I will try to correct them!!!!