I had hoped to turn learning into a game, but learning is not a game after all. Rather, I found myself spontaneously getting addicted to PyTorch.
How can we use PyTorch to implement a network? Nowadays networks are getting more and more complicated, and it is hard to grasp an author's ideas at a glance. So how can we read the implementations of today's popular papers? Often we fail to read them not because they are unreadable, but because our basics are not solid: instead of building things up step by step as the authors did, we jumped straight into the field. No hurry; today we will start from something simple. Building a simple network usually requires the following basics.
- Break the task down into a problem we can model
- Find an appropriate set of functions; usually we use a network with a certain structure to approximate this complex set of functions
- Collect appropriate data that accurately reflects our task
- Set a target, which for a neural network is the loss function, so that the model parameters have a direction in which to be adjusted
- Set the policy for tuning the parameters, that is, the optimizer
- Set the metrics the task cares about most; these metrics reflect the model's ability on the task, such as accuracy, precision and recall (a small sketch of these metrics follows this list)
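To make the last point concrete, here is a minimal, self-contained sketch (the labels and predictions are made up, and the task is assumed to be binary) of how accuracy, precision and recall could be computed with plain tensor operations:

import torch

# Illustrative only: hypothetical binary labels and predictions
y_true = torch.tensor([1, 0, 1, 1, 0, 1])
y_pred = torch.tensor([1, 0, 0, 1, 1, 1])

tp = ((y_pred == 1) & (y_true == 1)).sum().float()  # true positives
fp = ((y_pred == 1) & (y_true == 0)).sum().float()  # false positives
fn = ((y_pred == 0) & (y_true == 1)).sum().float()  # false negatives

accuracy = (y_pred == y_true).float().mean()
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(accuracy, precision, recall)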
Everything we do is based on PyTorch. People like PyTorch because it provides an elegant module- and class-based design.
import torch
import torch.nn as nn
torch.nn is the network design module; it provides the classes used to design a neural network.
import torch.nn.functional as F
This module provides functional versions of operations such as convolutions and activation functions. After using PyTorch for some time, we may notice that there is some overlap between torch.nn and torch.nn.functional. In particular, torch.nn provides classes that inherit from nn.Module and are stateful, whereas torch.nn.functional is stateless: its functions take everything they need as arguments and keep no internal parameters. For example, an nn.Conv2d module has internal attributes such as self.weight, while F.conv2d just defines the operation and needs to be passed all the parameters (including weights and biases).
Since torch.nn is more comprehensive, why do we need torch.nn.functional at all? Because nn.functional does not carry any extra state, it is more flexible than torch.nn and therefore indispensable.
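To see the stateful/stateless distinction in code, here is a small sketch (the shapes and channel counts are arbitrary) comparing nn.Conv2d with F.conv2d; the functional call is handed the module's own weight and bias, so both compute the same thing:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)          # a dummy batch: 1 image, 3 channels, 8 x 8

# Stateful: the module owns self.weight and self.bias
conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3, padding=1)
out1 = conv(x)

# Stateless: we must pass the weight (and bias) ourselves
out2 = F.conv2d(x, weight=conv.weight, bias=conv.bias, padding=1)

print(out1.shape, out2.shape)        # both torch.Size([1, 6, 8, 8])
print(torch.allclose(out1, out2))    # True: same operation, different interface

Back to something simpler: let's multiply an input by an nn.Parameter and look at the gradient machinery.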
x = torch.randn(1, 1)
w = nn.Parameter(torch.randn(1, 1))
output = x * w
print(output)
Because w is an nn.Parameter and therefore participates in gradient calculation, the tensor produced by the computation carries a grad_fn attribute recording the operation that created it.
tensor([[-0.1428]], grad_fn=<MulBackward0>)
output.backward()
print(w.grad)
tensor([[0.2470]])
After this small detour, let's get back to the main task and continue importing the packages we need.
from torch.utils.data import DataLoader
import torchvision.datasets as datasets
import torchvision.transforms as transforms
import torch.optim as optim  # optimizers such as Adam, used later for training
From torch.utils.data we import DataLoader to load the dataset; later we will share how to customize a dataset of our own. torchvision is the package PyTorch provides for computer vision: it offers predefined models, datasets and transforms that make it easy to do vision-oriented research. We also import torch.optim, which provides the optimizers we will use later.
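As a small aside, torchvision transforms can be chained with transforms.Compose. The sketch below is only illustrative (the mean and std are the commonly quoted MNIST statistics); the rest of this article simply uses transforms.ToTensor():

import torchvision.transforms as transforms

# ToTensor converts a PIL image to a [0, 1] tensor,
# Normalize then shifts and scales it channel by channel.
mnist_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])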
Define the network
class Net(nn.Module):
    def __init__(self, input_size, num_classes):
        super(Net, self).__init__()
        self.hidden_dim = 30
        self.fc1 = nn.Linear(input_size, self.hidden_dim)
        self.fc2 = nn.Linear(self.hidden_dim, num_classes)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
Usually, in __init__ we define the basic modules or layers of the network, and in forward we organize those layers into an effective computation. The network structure here consists of two fully connected layers with a ReLU in between; input_size specifies the sample size and num_classes specifies the number of categories. Then we pass an input x through the network to run forward propagation.
model = Net(784, 10)
x = torch.randn(64, 784)
print(model(x).shape)
We create a network with an input sample size of 28 x 28, that is, 784; you can tell at a glance that this is the MNIST dataset. Each input is a single-channel 28 x 28 image, which is flattened into a 784-dimensional vector before being fed into the network.
In x = torch.randn(64, 784), 64 is the batch size, meaning 64 samples are fed in at once, and for each sample the model outputs scores (logits) over the 10 categories.
torch.Size([64, 10])
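To make the flattening concrete, here is a small illustrative sketch of turning a batch of single-channel 28 x 28 images into 784-dimensional vectors; we will do exactly this inside the training loop later:

x_img = torch.randn(64, 1, 28, 28)        # a dummy batch of 64 single-channel images
x_flat = x_img.reshape(x_img.shape[0], -1)
print(x_flat.shape)                        # torch.Size([64, 784])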
Setting the device to run the network on
In Python, a conditional expression can be assigned directly to a variable, so the device is chosen in a single line depending on whether CUDA is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
Setting hyperparameters
input_size = 784
num_classes = 10
learning_rate = 0.001
batch_size = 64
num_epochs = 1
These parameters are not learned by the network; we set them ourselves. They are the input sample size, the number of categories, the learning rate, the number of samples fed to the model at a time (the batch size), and the total number of passes over the dataset. num_epochs is the number of times the model sees all of the samples.
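As a quick sanity check (assuming the standard MNIST training split of 60,000 images), we can work out how many batches one epoch contains:

import math

num_train_samples = 60000                      # MNIST training split (assumption)
iterations_per_epoch = math.ceil(num_train_samples / batch_size)
print(iterations_per_epoch)                    # 938 batches per epoch with batch_size = 64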
Preparing the data set
train_dataset = datasets.MNIST(root='dataset/',train=True,transform=transforms.ToTensor(),download=True)
train_loader = DataLoader(dataset=train_dataset,batch_size=batch_size,shuffle=True)
test_dataset = datasets.MNIST(root='dataset/',train=False,transform=transforms.ToTensor(),download=True)
test_loader = DataLoader(dataset=test_dataset,batch_size=batch_size,shuffle=True)
The datasets module provides multiple datasets, including MNIST. root specifies where the dataset is stored; train indicates whether this split is used for training; transform preprocesses the images; download indicates whether to download the dataset if it is not already present. The Dataset defines how individual samples are fetched, while the DataLoader defines how the data is loaded: how many samples we feed per batch and whether the samples are shuffled each time.
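A quick, optional check of what the dataset actually returns: indexing it gives a (tensor, label) pair.

# Illustrative sanity check: fetch a single sample by index
image, label = train_dataset[0]
print(image.shape, label)    # torch.Size([1, 28, 28]) and an integer class label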
Initializing the model
model = Net(input_size=input_size,num_classes=num_classes).to(device)
Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(),lr=learning_rate)
The loss function is cross entropy, which is commonly used for classification problems, and the optimizer is Adam with the learning rate set above.
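A small sketch of what the loss expects (the shapes and labels here are made up): raw scores of shape (batch, num_classes) and integer class indices. nn.CrossEntropyLoss applies log-softmax internally, so the network does not need a softmax at the end.

logits = torch.randn(4, 10)               # fake scores for 4 samples over 10 classes
labels = torch.tensor([3, 7, 0, 9])       # integer class indices, one per sample
loss = criterion(logits, labels)
print(loss.item())                        # a single scalar loss value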
Training the model
Now we start training the model. It is a long process, and it is one of the places where experience shows the most, so let's take it a little at a time. First, peek at the shape of one batch.
for epoch in range(num_epochs):
    for index, (data, targets) in enumerate(train_loader):
        data = data.to(device)
        targets = targets.to(device)
        print(data.shape)
        break
The DataLoader behaves like a generator: each iteration yields one batch of sample data.
torch.Size([64, 1, 28, 28])
64 is the number of images per batch, 1 is the number of channels, and 28 and 28 are the width and height of the image respectively.
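Since the DataLoader is iterable, we can also pull a single batch out of it directly, which is handy for quick checks (illustrative only):

images, labels = next(iter(train_loader))
print(images.shape, labels.shape)    # torch.Size([64, 1, 28, 28]) torch.Size([64])

With that confirmed, here is the full training loop: forward propagation, backpropagation and the parameter update.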
for epoch in range(num_epochs):
    for index, (data, targets) in enumerate(train_loader):
        data = data.to(device)
        targets = targets.to(device)
        data = data.reshape(data.shape[0], -1)
        # forward propagation
        pred = model(data)
        loss = criterion(pred, targets)
        # backpropagation
        optimizer.zero_grad()
        loss.backward()
        # update parameters
        optimizer.step()
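As an optional extension that is not part of the original loop, we could accumulate the loss and print an average per epoch, for example:

for epoch in range(num_epochs):
    running_loss = 0.0
    for index, (data, targets) in enumerate(train_loader):
        data = data.to(device).reshape(data.shape[0], -1)
        targets = targets.to(device)

        pred = model(data)
        loss = criterion(pred, targets)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    print(f"epoch {epoch}: mean loss {running_loss / len(train_loader)}")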
Validation
Validation checks, after a certain number of iterations, whether the currently updated model is the best one so far by comparing a chosen metric, so that we can decide whether to save it.
def validation_acc(loader, model):
    num_correct = 0
    num_samples = 0
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            x = x.to(device)
            y = y.to(device)
            x = x.reshape(x.shape[0], -1)
            y_hat = model(x)
            _, pred = y_hat.max(1)
            num_correct += torch.eq(pred, y).sum()
            num_samples += pred.size(0)
        print(f"acc {float(num_correct)/float(num_samples)}")
    acc = float(num_correct) / float(num_samples)
    model.train()
    return acc
The first step is to switch the model to eval mode. In eval mode, some layers behave differently than during training, such as dropout layers and batch normalization layers, and these training-time behaviors need to be switched off when evaluating. We also wrap the loop in torch.no_grad() so that no gradients are computed during validation.
validation_acc(test_loader,model)
The output
acc 0.9215
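Finally, since validation_acc returns the accuracy, a common pattern (sketched here, not part of the original code; the file name is just an example) is to keep only the best checkpoint:

best_acc = 0.0
acc = validation_acc(test_loader, model)
if acc > best_acc:
    best_acc = acc
    torch.save(model.state_dict(), "best_model.pth")   # example path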