This is my fourth day participating in the First Challenge 2022. For more details, see: First Challenge 2022.

If you remember nothing else from this post, here are two things worth remembering:

1. Logistic regression is not for solving regression problems; it is for solving classification problems. Don't be fooled by the name!

2. Logistic regression = linear regression + Sigmoid function.

1. The Logistic distribution

The logistic distribution looks very similar to the normal distribution, but its tails are heavier (it has higher kurtosis). Its distribution function and density function are as follows:


$$
F(x) = P(X \leq x) = \frac{1}{1 + e^{-(x-\mu)/\gamma}}
$$

$$
f(x) = F'(x) = \frac{e^{-(x-\mu)/\gamma}}{\gamma\left(1 + e^{-(x-\mu)/\gamma}\right)^{2}}
$$

Here, $\mu$ is the location parameter and $\gamma > 0$ is the scale parameter; they play the same roles as the corresponding parameters of the normal distribution. In particular, when $\mu = 0$ and $\gamma = 1$, the logistic distribution function becomes the Sigmoid function.
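To make that special case concrete, here is a minimal NumPy sketch (not from the original post) that implements $F(x)$ and $f(x)$ as defined above and checks that, with $\mu = 0$ and $\gamma = 1$, the distribution function coincides with the Sigmoid function:

```python
import numpy as np

def logistic_cdf(x, mu=0.0, gamma=1.0):
    # F(x) = 1 / (1 + exp(-(x - mu) / gamma))
    return 1.0 / (1.0 + np.exp(-(x - mu) / gamma))

def logistic_pdf(x, mu=0.0, gamma=1.0):
    # f(x) = exp(-(x - mu) / gamma) / (gamma * (1 + exp(-(x - mu) / gamma))^2)
    z = np.exp(-(x - mu) / gamma)
    return z / (gamma * (1.0 + z) ** 2)

x = np.linspace(-6.0, 6.0, 101)
sigmoid = 1.0 / (1.0 + np.exp(-x))
print(np.allclose(logistic_cdf(x), sigmoid))  # True: F(x) equals Sigmoid(x)
```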

2. Logistic regression

Logistic regression can be regarded as the combination of linear regression and the Sigmoid function. Linear regression fits the relationship between the independent and dependent variables ($y = wx + b$), while the Sigmoid function turns the regression problem into a classification problem: the regression result is used as the input of the Sigmoid function, and when the Sigmoid output is greater than 0.5 the sample is assigned to class 1, otherwise to class 0. Therefore, logistic regression can also be regarded as a probability estimate.
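As a toy illustration (the numbers below are made up, not from the post), one sample is pushed through the two stages, linear score then Sigmoid, and thresholded at 0.5:

```python
import torch

w, b = torch.tensor([2.0, -1.0]), torch.tensor(0.5)  # hypothetical weight and bias
x = torch.tensor([1.0, 3.0])                          # one sample with two features

logit = torch.dot(w, x) + b   # linear regression part: wx + b = -0.5
proba = torch.sigmoid(logit)  # Sigmoid part: about 0.378
label = int(proba > 0.5)      # 0.378 < 0.5, so the sample falls into class 0
print(logit.item(), proba.item(), label)
```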

3. PyTorch implementation

```python
import torch

class LogisticRegression(torch.nn.Module):
    def __init__(self, num_features):
        super(LogisticRegression, self).__init__()  # call the parent-class constructor
        self.linear = torch.nn.Linear(num_features, 1)
        self.linear.weight.detach().zero_()  # initialize the weights to 0
        self.linear.bias.detach().zero_()    # initialize the bias to 0

    def forward(self, x):
        logits = self.linear(x)         # linear part: wx + b
        probas = torch.sigmoid(logits)  # Sigmoid part: probability in (0, 1)
        return probas
```

In PyTorch, you can implement a custom model by inheriting from the torch.nn.Module class: define the layers in __init__ and the connections between them in forward. The forward method is the core of the model and must be overridden; otherwise the model cannot run, because it does not know how data flows between the layers.

torch.nn.Linear applies a linear transformation to the input data; its weight and bias attributes hold the transformation's weight and bias parameters.
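A quick sanity check of the class above (a sketch, assuming the class definition has already been run): since the weights and bias start at 0, every logit is 0, so every initial predicted probability is sigmoid(0) = 0.5.

```python
model = LogisticRegression(num_features=2)
dummy = torch.randn(4, 2)  # 4 random samples with 2 features each
print(model(dummy))        # a (4, 1) tensor whose entries are all 0.5
```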

4. An example

Taking the classification of the Iris data set as an example, we carry out a simple logistic regression classification task.

  • Data set acquisition and partitioning
```python
import matplotlib.pyplot as plt
import numpy as np
from io import BytesIO
import torch
import torch.nn.functional as F

# Download the Iris data set from the UCI repository
ds = np.lib.DataSource()
fp = ds.open('http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data')

# Keep the first two features of the first 100 rows (two classes, 50 rows each)
x = np.genfromtxt(BytesIO(fp.read().encode()), delimiter=',',
                  usecols=range(2), max_rows=100)
y = np.zeros(100)
y[50:] = 1

# Shuffle the indices, then split into 75 training rows and 25 test rows
np.random.seed(1)
idx = np.arange(y.shape[0])
np.random.shuffle(idx)
X_test, y_test = x[idx[:25]], y[idx[:25]]
X_train, y_train = x[idx[25:]], y[idx[25:]]

# Standardize the features using the training-set statistics
mu, std = np.mean(X_train, axis=0), np.std(X_train, axis=0)
X_train, X_test = (X_train - mu) / std, (X_test - mu) / std

fig, ax = plt.subplots(1, 2, figsize=(7, 2.5))
ax[0].scatter(X_train[y_train == 1, 0], X_train[y_train == 1, 1])
ax[0].scatter(X_train[y_train == 0, 0], X_train[y_train == 0, 1])
ax[1].scatter(X_test[y_test == 1, 0], X_test[y_test == 1, 1])
ax[1].scatter(X_test[y_test == 0, 0], X_test[y_test == 0, 1])
plt.show()
```

The full file has 150 data rows; we take only the first 100, which cover the first two classes (setosa and versicolor, 50 rows each), so the task becomes binary classification. The training and test sets are split by randomly shuffling the indices: the training set has 75 rows and the test set has 25 rows.
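A small verification sketch (not in the original) confirms the split sizes and the class balance:

```python
print(X_train.shape, X_test.shape)             # (75, 2) (25, 2)
print(int(y_train.sum()), int(y_test.sum()))   # number of class-1 rows in each split
```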

  • Model training
```python
# `device` was not defined in the original snippet; a CPU/GPU fallback is assumed here
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = LogisticRegression(num_features=2).to(device)
cost_fn = torch.nn.BCELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

def custom_where(cond, x_1, x_2):
    # Select x_1 where cond is 1 and x_2 where cond is 0
    return (cond * x_1) + ((1 - cond) * x_2)

def comp_accuracy(label_var, pred_probas):
    pred_labels = custom_where((pred_probas > 0.5).float(), 1, 0).view(-1)
    acc = torch.sum(pred_labels == label_var.view(-1)).float() / label_var.size(0)
    return acc

num_epochs = 10

X_train_tensor = torch.tensor(X_train, dtype=torch.float32, device=device)
y_train_tensor = torch.tensor(y_train, dtype=torch.float32, device=device).view(-1, 1)

for epoch in range(num_epochs):
    #### Compute outputs ####
    out = model(X_train_tensor)

    #### Compute gradients ####
    cost = cost_fn(out, y_train_tensor)
    optimizer.zero_grad()
    cost.backward()

    #### Update weights ####
    optimizer.step()

    #### Logging ####
    pred_probas = model(X_train_tensor)
    acc = comp_accuracy(y_train_tensor, pred_probas)
    print('Epoch: %03d' % (epoch + 1), end="")
    print(' | Train ACC: %.3f' % acc, end="")
    print(' | Cost: %.3f' % cost_fn(pred_probas, y_train_tensor))

print('\nModel parameters:')
print('  Weights: %s' % model.linear.weight)
print('  Bias: %s' % model.linear.bias)
```

BCELoss computes the binary cross-entropy loss between the target values and the predicted values; its mathematical formula is as follows:


$$
\text{Loss} = -w\left[\, y \log(\hat{y}) + (1 - y)\log(1 - \hat{y}) \,\right]
$$

Here $w$, $y$, and $\hat{y}$ denote the weight, the label (target), and the predicted probability (input), respectively. Setting reduction to 'sum' means that the losses of the individual samples are summed.
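The following minimal check (illustrative values, not from the post) confirms that BCELoss with reduction='sum' matches the formula above with $w = 1$:

```python
import torch

pred = torch.tensor([0.9, 0.2])    # predicted probabilities (made-up values)
target = torch.tensor([1.0, 0.0])  # labels

loss = torch.nn.BCELoss(reduction='sum')(pred, target)
manual = -(torch.log(pred[0]) + torch.log(1 - pred[1]))  # sum of per-sample losses
print(loss.item(), manual.item())  # both are about 0.3285
```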

The output is as follows:

```
Epoch: 001 | Train ACC: 0.987 | Cost: 5.581
Epoch: 002 | Train ACC: 0.987 | Cost: 4.882
Epoch: 003 | Train ACC: 1.000 | Cost: 4.381
Epoch: 004 | Train ACC: 1.000 | Cost: 3.998
Epoch: 005 | Train ACC: 1.000 | Cost: 3.693
Epoch: 006 | Train ACC: 1.000 | Cost: 3.443
Epoch: 007 | Train ACC: 1.000 | Cost: 3.232
Epoch: 008 | Train ACC: 1.000 | Cost: 3.052
Epoch: 009 | Train ACC: 1.000 | Cost: 2.896
Epoch: 010 | Train ACC: 1.000 | Cost: 2.758

Model parameters:
  Weights: Parameter containing:
tensor([[ 4.2267, -2.9613]], requires_grad=True)
  Bias: Parameter containing:
tensor([0.0994], requires_grad=True)
```
  • Model evaluation
```python
X_test_tensor = torch.tensor(X_test, dtype=torch.float32, device=device)
y_test_tensor = torch.tensor(y_test, dtype=torch.float32, device=device)

pred_probas = model(X_test_tensor)
test_acc = comp_accuracy(y_test_tensor, pred_probas)

print('Test set accuracy: %.2f%%' % (test_acc * 100))
```

The output is as follows:

```
Test set accuracy: 100.00%
```
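Since the model is linear in the two features, its decision boundary is the line $w_1 x_1 + w_2 x_2 + b = 0$, where the predicted probability equals 0.5. The following optional sketch (not part of the original post) draws that boundary over the test points:

```python
w = model.linear.weight.detach().view(-1).cpu().numpy()
b = model.linear.bias.detach().cpu().numpy()

# Solve w1*x1 + w2*x2 + b = 0 for x2
x1 = np.linspace(X_test[:, 0].min(), X_test[:, 0].max(), 100)
x2 = (-b[0] - w[0] * x1) / w[1]

plt.scatter(X_test[y_test == 1, 0], X_test[y_test == 1, 1])
plt.scatter(X_test[y_test == 0, 0], X_test[y_test == 0, 1])
plt.plot(x1, x2, 'k--')  # the learned decision boundary
plt.show()
```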