This article introduces the generalized linear model (GLM), of which linear regression, logistic regression, and Softmax regression are all special cases. Taking the expectation as the target, it derives the objective functions of linear regression, logistic regression, and Softmax regression, which further emphasizes the probabilistic interpretation of these models.

Author | loevinger

Editor | yuquanle

Generalized linear model

From the probabilistic interpretations of linear regression, logistic regression, Softmax regression, and maximum entropy, we can see that linear regression is the result of a Gaussian distribution + maximum likelihood estimation, logistic regression is the result of a Bernoulli distribution + maximum log-likelihood estimation, Softmax regression is the result of a multinomial distribution + maximum log-likelihood estimation, and maximum entropy is the result of expectation + log-likelihood estimation. The first three can all be viewed as generalized linear models.

A. The exponential family

The exponential family refers to probability distributions that can be written in the following exponential form:

$$p(y; \eta) = b(y)\exp\left(\eta^T T(y) - a(\eta)\right)$$

Here $\eta$ is the natural parameter of the distribution, $T(y)$ is the sufficient statistic (usually $T(y) = y$), and $a(\eta)$ is the log-partition function that normalizes the distribution. Once $T$, $a$, and $b$ are fixed, they define a family of distributions parameterized by $\eta$.

In fact, many common probability distributions belong to the exponential family, such as:

1) Bernoulli distribution: binary (0-1) outcomes

2) Binomial and multinomial distributions: repeated trials with multiple possible values

3) Poisson distribution: counting processes

4) Gamma and exponential distribution

5) Beta distribution

6) Dirichlet distribution

7) Gaussian distribution

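Now we write the Gaussian and Bernoulli distributions in exponential-family form.

Gaussian distribution (taking $\sigma^2 = 1$ for simplicity):

$$p(y; \mu) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{(y - \mu)^2}{2}\right)$$

The corresponding exponential-family form is:

$$p(y; \mu) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{y^2}{2}\right)\exp\left(\mu y - \frac{\mu^2}{2}\right)$$

so that $\eta = \mu$, $T(y) = y$, $a(\eta) = \eta^2 / 2$, and $b(y) = \frac{1}{\sqrt{2\pi}}\exp(-y^2/2)$.

Bernoulli distribution:

$$p(y; \phi) = \phi^{y}(1 - \phi)^{1 - y}, \qquad y \in \{0, 1\}$$

The corresponding exponential-family form is:

$$p(y; \phi) = \exp\left(y \log\frac{\phi}{1 - \phi} + \log(1 - \phi)\right)$$

so that $\eta = \log\frac{\phi}{1 - \phi}$, $T(y) = y$, $a(\eta) = -\log(1 - \phi) = \log(1 + e^{\eta})$, and $b(y) = 1$. Solving for $\phi$ gives $\phi = \frac{1}{1 + e^{-\eta}}$, which is the sigmoid function.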
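As a quick numerical check, the following minimal sketch compares the usual Bernoulli and Gaussian densities with their exponential-family forms $b(y)\exp(\eta T(y) - a(\eta))$. It assumes a unit-variance Gaussian and a scalar natural parameter; the function names are illustrative and do not come from the original article.

```python
import math

def exp_family_density(y, eta, T, a, b):
    """Evaluate b(y) * exp(eta * T(y) - a(eta)) for a scalar natural parameter."""
    return b(y) * math.exp(eta * T(y) - a(eta))

def bernoulli_pmf(y, phi):
    """Standard Bernoulli pmf for y in {0, 1}."""
    return phi ** y * (1 - phi) ** (1 - y)

def bernoulli_exp_family(y, phi):
    eta = math.log(phi / (1 - phi))  # natural parameter: the log-odds
    return exp_family_density(
        y, eta,
        T=lambda v: v,
        a=lambda e: math.log(1 + math.exp(e)),  # equals -log(1 - phi)
        b=lambda v: 1.0,
    )

def gaussian_pdf(y, mu):
    """Gaussian density with variance fixed to 1 (an assumption of this sketch)."""
    return math.exp(-(y - mu) ** 2 / 2) / math.sqrt(2 * math.pi)

def gaussian_exp_family(y, mu):
    eta = mu  # for unit variance the natural parameter is the mean
    return exp_family_density(
        y, eta,
        T=lambda v: v,
        a=lambda e: e ** 2 / 2,
        b=lambda v: math.exp(-v ** 2 / 2) / math.sqrt(2 * math.pi),
    )
```

For any valid input, for example `bernoulli_pmf(1, 0.3)` against `bernoulli_exp_family(1, 0.3)`, the two forms should agree up to floating-point error.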

B. Generalized linear model

Having covered the exponential family, let's look at the formal definition and assumptions of the generalized linear model:

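1) Given the sample $x$ and the parameter $\theta$, the label $y$ follows a distribution from the exponential family: $y \mid x; \theta \sim \text{ExponentialFamily}(\eta)$.

2) Given $x$, the goal is to predict the expected value of $T(y)$: $h_\theta(x) = E[T(y) \mid x]$.

3) The natural parameter is linear in the input: $\eta = \theta^T x$.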

The first assumption lets us discuss the probability of $y$ within the exponential family; the second makes the predicted value the mean of the distribution of the true value; the third makes the decision function (model) linear in the input.

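Combining the exponential-family form of the Gaussian distribution with the definition of the generalized linear model gives the linear regression model:

$$h_\theta(x) = E[y \mid x; \theta] = \mu = \eta = \theta^T x$$

Likewise, combining the exponential-family form of the Bernoulli distribution with the generalized linear model gives the logistic regression model, which explains why the sigmoid function is used:

$$h_\theta(x) = E[y \mid x; \theta] = \phi = \frac{1}{1 + e^{-\eta}} = \frac{1}{1 + e^{-\theta^T x}}$$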

Therefore, in a generalized linear model the linear predictor comes from the third assumption, and the final form of the model depends on which distribution the label is assumed to follow, such as the Gaussian or the Bernoulli distribution.
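The point that both models share one linear predictor and differ only in how it is mapped to the mean can be illustrated with a minimal sketch. The parameter and feature values below are hypothetical, and the helper names are illustrative rather than taken from the original article.

```python
import math

def linear_predictor(theta, x):
    """eta = theta^T x, the third GLM assumption."""
    return sum(t * xi for t, xi in zip(theta, x))

def identity(eta):
    """Gaussian case: E[y | x] = mu = eta."""
    return eta

def sigmoid(eta):
    """Bernoulli case: E[y | x] = phi = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def glm_predict(theta, x, response):
    """Predict the mean of y by applying the response function to eta."""
    return response(linear_predictor(theta, x))

theta = [0.5, -1.0, 2.0]  # hypothetical parameters
x = [1.0, 2.0, 0.5]       # hypothetical features; x[0] = 1 plays the role of an intercept

linear_prediction = glm_predict(theta, x, identity)   # linear regression output
logistic_prediction = glm_predict(theta, x, sigmoid)  # logistic regression output
```

Swapping the response function is the only change between the two models here; the linear part is untouched.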

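Similarly, we apply the same set of definitions to Softmax regression, starting from the multinomial distribution over $k$ classes that corresponds to it:

$$p(y; \phi) = \prod_{i=1}^{k} \phi_i^{1\{y = i\}}$$

Here $\phi_i = p(y = i)$ is the probability of class $i$, and $1\{\cdot\}$ is the indicator function: it equals 1 if its argument is true and 0 otherwise. Following the vectorized definition used in Softmax, $T(y)$ is the vector with components $(T(y))_i = 1\{y = i\}$ for $i = 1, \dots, k-1$.

The corresponding exponential-family form is:

$$p(y; \phi) = \exp\left(\sum_{i=1}^{k-1} (T(y))_i \log\frac{\phi_i}{\phi_k} + \log\phi_k\right) = b(y)\exp\left(\eta^T T(y) - a(\eta)\right)$$

with $\eta_i = \log\frac{\phi_i}{\phi_k}$, $a(\eta) = -\log\phi_k$, and $b(y) = 1$.

From this we deduce:

$$e^{\eta_i} = \frac{\phi_i}{\phi_k}, \qquad \phi_i = \phi_k e^{\eta_i}$$

For convenience, define $\eta_k = \log\frac{\phi_k}{\phi_k} = 0$. Since the probabilities of all values of the multinomial distribution sum to 1, we have:

$$\sum_{i=1}^{k} \phi_i = \phi_k \sum_{i=1}^{k} e^{\eta_i} = 1$$

So:

$$\phi_i = \frac{e^{\eta_i}}{\sum_{j=1}^{k} e^{\eta_j}}$$

Applying the second assumption of the generalized linear model and substituting the third, linear assumption $\eta_i = \theta_i^T x$ (with $\theta_k = 0$ so that $\eta_k = 0$) gives:

$$h_\theta(x) = E[T(y) \mid x; \theta] = \begin{bmatrix} \phi_1 \\ \vdots \\ \phi_{k-1} \end{bmatrix}, \qquad \phi_i = \frac{e^{\theta_i^T x}}{\sum_{j=1}^{k} e^{\theta_j^T x}}$$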
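The mapping from natural parameters to class probabilities derived above can be sketched as follows. This is a minimal illustration, not the author's implementation: the max-shift is a standard numerical-stability trick added here, and the representation of `thetas` as one parameter list per class is an assumption.

```python
import math

def softmax(etas):
    """Map natural parameters eta_1..eta_k to probabilities phi_1..phi_k."""
    shift = max(etas)  # subtracting a constant leaves the ratios unchanged and avoids overflow
    exps = [math.exp(e - shift) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_hypothesis(thetas, x):
    """thetas holds one parameter vector per class; eta_i = theta_i^T x."""
    etas = [sum(t * xi for t, xi in zip(theta, x)) for theta in thetas]
    return softmax(etas)
```

With two classes and the second natural parameter fixed at 0, `softmax([eta, 0.0])[0]` reduces to the sigmoid of `eta`, which matches the view of Softmax regression as an extension of logistic regression.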

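Finally, the objective function of Softmax regression, obtained by maximum likelihood estimation over $m$ training samples $(x^{(i)}, y^{(i)})$, is the log-likelihood:

$$\ell(\theta) = \sum_{i=1}^{m} \log p(y^{(i)} \mid x^{(i)}; \theta) = \sum_{i=1}^{m} \log \prod_{l=1}^{k} \left(\frac{e^{\theta_l^T x^{(i)}}}{\sum_{j=1}^{k} e^{\theta_j^T x^{(i)}}}\right)^{1\{y^{(i)} = l\}}$$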
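The Softmax log-likelihood can be evaluated with a short sketch like the one below. It assumes labels are encoded as class indices 0 to k-1 (the text indexes classes from 1 to k) and that `thetas` is a list with one parameter vector per class; both choices are illustrative.

```python
import math

def log_softmax(etas):
    """Return log(phi_i) for each class, using the log-sum-exp trick for stability."""
    shift = max(etas)
    log_norm = shift + math.log(sum(math.exp(e - shift) for e in etas))
    return [e - log_norm for e in etas]

def softmax_log_likelihood(thetas, xs, ys):
    """Sum over samples of log p(y_i | x_i; theta); ys are class indices 0..k-1."""
    total = 0.0
    for x, y in zip(xs, ys):
        etas = [sum(t * xi for t, xi in zip(theta, x)) for theta in thetas]
        # The indicator in the objective selects the log-probability of the true class.
        total += log_softmax(etas)[y]
    return total
```

Maximizing this quantity over the parameters (for instance with gradient ascent) is equivalent to minimizing the usual cross-entropy loss.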

At this point, the explanation of linear regression, logistic regression, and Softmax regression through the generalized linear model is essentially complete. The linear function comes from the third assumption of the generalized linear model, the sigmoid function arises from the Bernoulli distribution, and Softmax regression is the multi-class extension of logistic regression.
