Training a Classifier
In early 2019, ApacheCN organized volunteers to translate the PyTorch 1.0 Chinese documentation (see the project page), with official authorization from PyTorch. I believe many people already know about the Chinese official website. But so far we are short of proofreaders, and we hope more people will participate. We have been in email contact with PyTorch's Bruce Lin for some time. At an appropriate time, we will organize volunteers to work on other PyTorch projects. Please join us and follow us. We hope our series of work can be helpful to you.
Translator: bat67
Proofread by FontTian
So far, we’ve seen how to define networks, calculate losses, and update network weights. So now you might be thinking,
What about the data?
In general, when you have to work with image, text, audio, or video data, you can use the standard Python packages to load the data into a numpy array. Then you can convert that array into a torch.*Tensor.
- For images, there are Pillow, OpenCV and other packages available
- For audio, there are packages like Scipy and Librosa available
- For text, either raw Python or Cython-based loading works, or you can use NLTK and SpaCy
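For instance, here is a minimal sketch of that numpy-to-tensor path (the file name image.png is just a placeholder):

import numpy as np
import torch
from PIL import Image

# load an image with Pillow as an H x W x C uint8 numpy array
img = np.array(Image.open('image.png').convert('RGB'))

# convert to a float tensor and rearrange to C x H x W, the layout PyTorch expects
tensor = torch.from_numpy(img).float().permute(2, 0, 1) / 255.0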
Especially for vision, we have created a package called torchvision, which contains data loaders for common datasets such as Imagenet, CIFAR10, and MNIST, as well as image transformation operations, namely torchvision.datasets and torch.utils.data.DataLoader.
This provides a great convenience and avoids writing boilerplate code.
In this tutorial, we will use the CIFAR10 dataset, which has ten classes: "airplane", "car", "bird", "cat", "deer", "dog", "frog", "horse", "ship", and "truck". The images in CIFAR-10 are of size 3x32x32, i.e. 3-channel color images of 32×32 pixels.
Train an image classifier
We will do the following steps in order:
- Load and normalize the CIFAR10 training and test datasets using torchvision
- Define a convolutional neural network
- Define a loss function
- Train the network on the training data
- Test the network on the test data
1. Load and normalize CIFAR10
Loading CIFAR10 using torchvision is super easy.
import torch
import torchvision
import torchvision.transforms as transforms
The output of the torchvision datasets are PILImage images in the range [0, 1]. We normalize them to tensors in the range [-1, 1]: with a mean and standard deviation of 0.5 per channel, each value x is mapped to (x - 0.5) / 0.5, so 0 becomes -1 and 1 stays 1.
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
Output:
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz
Files already downloaded and verified
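As a quick sanity check (a sketch we are adding here, not part of the original tutorial), one batch from the loader should be 4 RGB images of 32×32 pixels, normalized to roughly [-1, 1]:

images, labels = iter(trainloader).next()
print(images.shape)                              # torch.Size([4, 3, 32, 32])
print(images.min().item(), images.max().item())  # approximately -1.0 and 1.0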
For fun, now let’s visualize some of the training data.
import matplotlib.pyplot as plt
import numpy as np

# function to show an image
def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

# get some random training images
dataiter = iter(trainloader)
images, labels = dataiter.next()

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
Output:
horse horse horse car
2. Define a convolutional neural network
Take the neural network defined in the previous neural networks section and modify it to take 3-channel images as input (instead of the 1-channel images it was originally defined for).
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
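To see where the 16 * 5 * 5 in fc1 comes from: each 5×5 convolution (no padding) shrinks a 32×32 input to 28×28, pooling halves it to 14×14, the second convolution shrinks it to 10×10, and pooling halves it again to 5×5 with 16 channels. A quick check of this (our addition, not in the original tutorial):

# a fake batch of one 3x32x32 image should yield 10 class scores
dummy = torch.randn(1, 3, 32, 32)
print(net(dummy).shape)  # torch.Size([1, 10])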
3. Define a loss function and optimizer
We use a classification cross-entropy loss and stochastic gradient descent (SGD) with momentum.
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
4. Train the network
This is when things start to get interesting. We simply have to loop over our data iterator, "feed" the inputs to the network, and optimize.
for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
Output:
[1,  2000] loss: 2.182
[1,  4000] loss: 1.819
[1,  6000] loss: 1.648
[1,  8000] loss: 1.569
[1, 10000] loss: 1.511
[1, 12000] loss: 1.473
[2,  2000] loss: 1.414
[2,  4000] loss: 1.365
[2,  6000] loss: 1.358
[2,  8000] loss: 1.322
[2, 10000] loss: 1.298
[2, 12000] loss: 1.282
Finished Training
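If you want to keep the trained weights, you can save the model's state_dict and restore it later; a minimal sketch (the path ./cifar_net.pth is just an example name):

# save the trained parameters
PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)

# later, restore them into a fresh instance of the same architecture
net = Net()
net.load_state_dict(torch.load(PATH))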
5. Test the network on the test data
We have trained the network for 2 passes over the training dataset. But we need to check whether the network has learned anything at all.
We will check this by predicting the class label that the neural network outputs and comparing it against the ground truth. If the prediction is correct, we add the sample to the list of correct predictions.
Okay, first step. Let us display some images from the test set to get familiar.
dataiter = iter(testloader)
images, labels = dataiter.next()

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))
Output:
GroundTruth: cat ship ship plane
Ok, now let's see what the neural network thinks these examples above are:
outputs = net(images)
The outputs are energies for the 10 classes. The higher the energy for a class, the more the network thinks the image belongs to that particular class. So let's get the index of the highest energy:
_, predicted = torch.max(outputs, 1)
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))
Output:
Predicted: dog ship ship plane
It worked out pretty well.
Let’s see how the network performs on the whole data set.
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
Output:
Accuracy of the network on the 10000 test images: 55 %
This is much better than chance (i.e., randomly picking one of the 10 classes, which would be correct about 10% of the time). It seems the network has learned something.
So which classes are doing well? What are the poorly performing classes?
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
Output:
Accuracy of plane : 70 %
Accuracy of car : 70 %
Accuracy of bird : 28 %
Accuracy of cat : 25 %
Accuracy of deer : 37 %
Accuracy of dog : 60 %
Accuracy of frog : 66 %
Accuracy of horse : 62 %
Accuracy of ship : 69 %
Accuracy of truck : 61 %
Ok, what’s next?
How do you run a neural network on a GPU?
Train on GPU
Just like you transfer a Tensor onto the GPU, you transfer the neural network onto the GPU in the same way.
If we have CUDA available, let’s define the first device as a visible CUDA device:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Assuming that we are on a CUDA machine, this should print a CUDA device:
print(device)
Output:
cuda:0
The rest of this section assumes that device is CUDA.
This method then recursively goes over all modules and converts their parameters and buffers to CUDA tensors:
net.to(device)
Keep in mind that we also have to send the inputs and targets to the GPU at every step:
inputs, labels = inputs.to(device), labels.to(device)
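Putting this together, the inner training loop from step 4 would change roughly as follows (a sketch; only the .to(device) line is new):

for i, data in enumerate(trainloader, 0):
    # move the batch onto the GPU (or CPU, if CUDA is unavailable)
    inputs, labels = data[0].to(device), data[1].to(device)

    optimizer.zero_grad()
    outputs = net(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()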
Why don't we notice a massive speedup compared to the CPU? Because our network is really small.
Try this exercise: increase the width of your network (note that the second argument of the first nn.Conv2d must equal the first argument of the second nn.Conv2d), and see what kind of speedup you get.
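For example, a minimal sketch that widens the hidden layer from 6 to 64 channels (64 is an arbitrary choice; everything after conv2 is unchanged, since conv2 still outputs 16 channels):

class WideNet(nn.Module):
    def __init__(self):
        super(WideNet, self).__init__()
        # second argument of conv1 must equal first argument of conv2
        self.conv1 = nn.Conv2d(3, 64, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(64, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x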
Goals Achieved:
- Understand PyTorch's Tensor library and neural networks at a high level
- Train a small neural network to classify images
Train on multiple GPUs
For even greater acceleration using all of your GPUs, check out Optional: Data Parallelism.
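The core of that approach is a single wrapper around the model (a sketch; see the linked tutorial for the full treatment):

# split each input batch across all available GPUs
net = nn.DataParallel(net)
net.to(device)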
What’s next?
- Train neural nets to play video games
- Train a state-of-the-art ResNet network on imagenet
- Train a face generator using Generative Adversarial Networks
- Train a word-level language model using Recurrent LSTM networks
- More examples
- More tutorials
- Discuss PyTorch on the Forums
- Chat with other users on Slack