
1. Deep learning modeling process

In practice, both hand-rolled implementations and library-based implementations need to follow the same general deep learning modeling process.

  • Model selection: in deep learning, model selection means determining the basic structure of the neural network: the number of layers, the number of neurons in each layer, and the choice of activation functions.
  • Determine the objective function: once the basic structure of the model is fixed, define an objective function based on the actual modeling task. It is an equation built from the model parameters whose value reflects the modeling goal; in most cases we solve for the minimum of this equation.
  • Select an optimization method: to find that minimum, pick the most suitable optimization tool based on the functional characteristics of the loss function, while taking the actual computational cost into account.
  • Model training: use the chosen optimization method to solve the loss function and train the model until it is usable, yielding a set of parameter values for the connected neurons of the network. A rough PyTorch sketch of these four steps follows this list.
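As a rough illustration only (not code from this article), the four steps map onto PyTorch as follows; the model structure, loss, optimizer, and dummy data here are all placeholder choices:

import torch
from torch import nn, optim

# 1. Model selection: structure, depth, width, activation (placeholder choices)
model = nn.Sequential(nn.Linear(2, 4), nn.ReLU(), nn.Linear(4, 1))

# 2. Objective function: a function of the model parameters, to be minimized
criterion = nn.MSELoss()

# 3. Optimization method: chosen by loss characteristics and compute budget
optimizer = optim.SGD(model.parameters(), lr=0.03)

# 4. Model training: iterate until a usable set of parameter values is found
X, y = torch.randn(100, 2), torch.randn(100, 1)   # dummy data
for epoch in range(3):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()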

2. Manual implementation of linear regression modeling

import random
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import torch
from torch import nn, optim
import torch.nn.functional as F
from torch.utils.data import Dataset, TensorDataset, DataLoader
from torch.utils.tensorboard import SummaryWriter

The data generation functions created in the earlier data creation section:

def tensorGenReg(num_examples=1000, w=[2, -1, 1], bias=True, delta=0.01, deg=1):
    """Create a regression dataset.
    :param num_examples: number of samples in the dataset
    :param w: feature coefficients (the last entry is the intercept when bias=True)
    :param bias: whether the data has an intercept
    :param delta: scale of the perturbation term
    :param deg: degree of the equation
    :return: feature tensor and label tensor
    """
    if bias == True:
        num_inputs = len(w) - 1                                  # number of features
        features_true = torch.randn(num_examples, num_inputs)    # feature tensor without the all-ones column
        w_true = torch.tensor(w[:-1]).reshape(-1, 1).float()     # true coefficients
        b_true = torch.tensor(w[-1]).float()                     # true intercept
        if num_inputs == 1:
            labels_true = torch.pow(features_true, deg) * w_true + b_true
        else:
            labels_true = torch.mm(torch.pow(features_true, deg), w_true) + b_true
        features = torch.cat((features_true, torch.ones(len(features_true), 1)), 1)
        labels = labels_true + torch.randn(size=labels_true.shape) * delta
    else:
        num_inputs = len(w)
        features = torch.randn(num_examples, num_inputs)
        w_true = torch.tensor(w).reshape(-1, 1).float()
        if num_inputs == 1:
            labels_true = torch.pow(features, deg) * w_true
        else:
            labels_true = torch.mm(torch.pow(features, deg), w_true)
        labels = labels_true + torch.randn(size=labels_true.shape) * delta
    return features, labels

# create a function for classification datasets
def tensorGenCla(num_examples=500, num_inputs=2, num_class=3, deg_dispersion=[4, 2], bias=False):
    """Create a classification dataset.
    :param num_examples: number of samples in each class
    :param num_inputs: number of features in the dataset
    :param num_class: total number of classes in the dataset
    :param deg_dispersion: the first element is the reference spacing of the class means,
        the second is the standard deviation of the random numbers
    :param bias: whether to append a column of ones
    :return: feature tensor and label tensor
    """
    cluster_l = torch.empty(num_examples, 1)   # shape of each class's label tensor
    mean_ = deg_dispersion[0]                  # reference value for each class mean
    std_ = deg_dispersion[1]                   # standard deviation
    lf = []                                    # list collecting feature tensors
    ll = []                                    # list collecting label tensors
    k = mean_ * (num_class - 1) / 2            # centering constant
    for i in range(num_class):
        data_temp = torch.normal(i * mean_ - k, std_, size=(num_examples, num_inputs))
        lf.append(data_temp)
        labels_temp = torch.full_like(cluster_l, i)   # labels for this class
        ll.append(labels_temp)
    features = torch.cat(lf).float()
    labels = torch.cat(ll).long()
    if bias == True:
        features = torch.cat((features, torch.ones(len(features), 1)), 1)   # append a column of ones
    return features, labels

def data_iter(batch_size, features, labels):
    """Split the dataset into shuffled mini-batches.
    :param batch_size: number of samples in each subset
    :param features: input feature tensor
    :param labels: label tensor
    :return: list of [features, labels] mini-batches
    """
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)
    l = []
    for i in range(0, num_examples, batch_size):
        j = torch.tensor(indices[i:min(i + batch_size, num_examples)])
        l.append([torch.index_select(features, 0, j), torch.index_select(labels, 0, j)])
    return l
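A quick usage check of my own (the shapes follow from the default arguments above):

torch.manual_seed(420)

# regression data: 1000 rows, 2 features plus an all-ones column
f_reg, l_reg = tensorGenReg()
print(f_reg.shape, l_reg.shape)   # torch.Size([1000, 3]) torch.Size([1000, 1])

# classification data: 3 classes of 500 rows each
f_cla, l_cla = tensorGenCla()
print(f_cla.shape, l_cla.shape)   # torch.Size([1500, 2]) torch.Size([1500, 1])

# mini-batches of 10
batches = data_iter(10, f_reg, l_reg)
print(len(batches), batches[0][0].shape)   # 100 torch.Size([10, 3])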

2.1 Modeling Process

2.1.1 Model selection

Create a regression dataset whose true relationship is y = 2x₁ − x₂ + 1 and whose perturbation term is small. Given this modeling goal, a neural network containing only a single layer is enough.

torch.manual_seed(420)
features,labels=tensorGenReg()
def linreg(X,w):
    return torch.mm(X,w)
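As a quick sanity check of my own: the generated features include an all-ones column for the intercept, so the parameter tensor needs three rows:

w_test = torch.zeros(3, 1)              # 2 feature weights + 1 intercept
print(features.shape)                   # torch.Size([1000, 3])
print(linreg(features, w_test).shape)   # torch.Size([1000, 1])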

2.1.2 Determine the objective function

Use MSE as the loss function; here it also serves as the objective function.

def squared_loss(y_hat, y):
    num_ = y.numel()
    sse = torch.sum((y_hat.reshape(-1, 1) - y.reshape(-1, 1)) ** 2)
    return sse / num_
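A quick check of my own that the hand-written loss matches PyTorch's built-in F.mse_loss (F was imported above):

y_hat = torch.tensor([1.0, 2.0, 3.0]).reshape(-1, 1)
y = torch.tensor([1.5, 2.0, 2.0]).reshape(-1, 1)
print(squared_loss(y_hat, y))   # tensor(0.4167)
print(F.mse_loss(y_hat, y))     # tensor(0.4167)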

2.1.3 Define the optimization algorithm

Mini-batch gradient descent is used for solving; each iteration performs parameter ← parameter − learning rate × gradient.

def sgd(params, lr):
    params.data -= lr * params.grad   # update via .data so the change is not tracked
    params.grad.zero_()               # reset the accumulated gradient

Note: an in-place operation on a differentiable tensor leaves the system unable to distinguish leaf nodes from other nodes. For example, create a differentiable tensor w; w is then a leaf node, and once requires_grad is enabled every computation involving w is recorded in the computational graph. Overwriting w's original value with a newly generated value therefore raises an error, because the system can no longer determine whether w is a leaf node or an intermediate node.

w = torch.tensor(2., requires_grad=True)
print(w)           # tensor(2., requires_grad=True)
print(w.is_leaf)   # True
w1 = w * 2         # a new node in the graph, not a leaf
print(w1)          # tensor(4., grad_fn=<MulBackward0>)
w = torch.tensor(2., requires_grad=True)
w -= w * 2         # in-place modification of a leaf: raises RuntimeError

If backpropagation can no longer compute derivatives for the leaf nodes, the computational graph loses its core value. In practice, operations that could destroy leaf nodes should therefore be avoided. There are three ways to modify leaf node values, illustrated in the snippet after this list:

  • Suspend tracking with torch.no_grad()
  • Generate a new, untracked variable with w.detach()
  • Use .data to access the values of the differentiable tensor so the modification is not traced
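A minimal sketch of my own illustrating all three approaches (each resets w first):

# 1. suspend tracking with torch.no_grad()
w = torch.tensor(2., requires_grad=True)
with torch.no_grad():
    w -= w * 2            # allowed: the update is not recorded in the graph
print(w)                  # tensor(-2., requires_grad=True)

# 2. detach() returns a new tensor that shares storage but is not tracked
w = torch.tensor(2., requires_grad=True)
w_detached = w.detach()
w_detached -= w_detached * 2   # modifies the shared values without tracking
print(w)                  # tensor(-2., requires_grad=True)

# 3. .data accesses the values directly, bypassing autograd
w = torch.tensor(2., requires_grad=True)
w.data -= w.data * 2
print(w)                  # tensor(-2., requires_grad=True)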

2.1.4 Training model

def sgd(params, lr):
    params.data -= lr * params.grad
    params.grad.zero_()

# set the random seed
torch.manual_seed(300)

# initialize core parameters
batch_size = 10    # size of each mini-batch
lr = 0.03          # learning rate
num_epochs = 3     # number of passes over the training data
w = torch.zeros(3, 1, requires_grad=True)   # differentiable parameter tensor

# the model and loss participating in training
net = linreg           # the model
loss = squared_loss    # MSE as the loss function

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w), y)
        l.backward()
        sgd(w, lr)
    train_l = loss(net(features, w), labels)
    print('epoch %d, loss %f' % (epoch + 1, train_l))
print("*" * 100)
print(w)
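As an optional check of my own (not part of the original tutorial), the learned w can be compared with the closed-form least-squares solution, which should land near the true coefficients (2, -1, 1):

# closed-form least squares on the same design matrix (features includes the ones column)
w_ls = torch.linalg.lstsq(features, labels).solution
print(w_ls)   # expected to be close to [[2.], [-1.], [1.]]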

2.2 Using TensorBoard to record the change in loss during iteration

writer = SummaryWriter(log_dir='reg_loss')

# initialize core parameters
batch_size = 10
lr = 0.03
num_epochs = 3
torch.manual_seed(300)

w = torch.zeros(3, 1, requires_grad=True)
net = linreg
loss = squared_loss

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w), y)
        l.backward()
        sgd(w, lr)
    train_l = loss(net(features, w), labels)
    writer.add_scalar('mul', train_l, epoch)

Enter in the terminal:

tensorboard --logdir="reg_loss"

Open the address TensorBoard prints (http://localhost:6006 by default) in a browser to view the loss curve.

You can also increase num_epochs (3 here) to watch how the MSE changes over a longer training run.

3. Fast implementation of linear regression

For modeling we can directly call functions and classes in PyTorch. The process still strictly follows the general deep learning modeling workflow. However, because of the peculiarities of deep learning, we often can neither inspect the actual data in tabular form nor precisely control every internal step of the model, and we need to create a large number of classes (for reading data or for modeling) and pass in many parameters, all of which causes beginners no small amount of trouble. Hence the following library-based modeling exercise.

3.1 Tuning library modeling process

3.1.1 Defining core parameters

batch_size = 10
lr = 0.03
num_epochs = 3

3.1.2 Data Preparation

torch.manual_seed(300)
features, labels = tensorGenReg()
features = features[:, :-1]   # drop the all-ones column; nn.Linear adds its own bias
data = TensorDataset(features, labels)
batchData = DataLoader(data, batch_size=batch_size, shuffle=True)
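To see what the DataLoader yields, a quick look of my own at the first mini-batch:

X, y = next(iter(batchData))
print(X.shape)   # torch.Size([10, 2])
print(y.shape)   # torch.Size([10, 1])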

3.1.3 Define the model

class LR(nn.Module):
    def __init__(self, in_features=2, out_features=1):
        super(LR, self).__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x):
        out = self.linear(x)
        return out

LR_model = LR()
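For a model this simple, an equivalent alternative (my own note, not from the original) is to skip the custom class and use nn.Sequential:

LR_model_alt = nn.Sequential(nn.Linear(2, 1))   # the same single linear layer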

3.1.4 Define the loss function

criterion = nn.MSELoss()

3.1.5 Define optimization methods

optimizer = optim.SGD(LR_model.parameters(), lr=0.03)

3.1.6 Model training

# model training
def fit(net, criterion, optimizer, batchdata, epochs):
    for epoch in range(epochs):
        for X, y in batchdata:
            yhat = net.forward(X)
            loss = criterion(yhat, y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        writer.add_scalar('loss', loss, global_step=epoch)
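Note that fit logs to the global writer; if you run this section on its own, create one first (reusing the reg_loss directory from section 2.2):

writer = SummaryWriter(log_dir='reg_loss')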

3.1.7 Perform model training

torch.manual_seed(300)
fit(net=LR_model, criterion=criterion, optimizer=optimizer, batchdata=batchData, epochs=num_epochs)

3.1.8 Check the training effect

print(list(LR_model.parameters()))
print('*' * 50)
print(criterion(LR_model(features), labels))

Since the data itself is constructed from the basic rule y = 2x₁ − x₂ + 1 plus a perturbation term, the parameters obtained through training show that the model fits well. Of course, in a real scenario we cannot take a God's-eye view of the true data-generating rule and judge the model against it; in that case we need well-defined model evaluation metrics.