Contents
Abstract
Import the libraries used by the project
Setting global Parameters
Image preprocessing
Read the data
Set up the model
Set up training and validation
Validation
Complete code:
Abstract
ResNet (Residual Neural Network) was proposed by Kaiming He and three colleagues at Microsoft Research. Using residual units, they successfully trained a 152-layer neural network and won the ILSVRC 2015 competition with a top-5 error rate of 3.57%, while using fewer parameters than VGGNet.
The model's key innovation is residual learning: a direct channel is added to the network so that the original input is passed unchanged to later layers, as shown in the figure below:
Traditional convolutional or fully connected networks lose some information as it passes through the layers, and vanishing or exploding gradients can make very deep networks impossible to train. ResNet alleviates this problem by routing the input directly around the layers to the output, which preserves the information. The network then only has to learn the difference (the residual) between input and output, which simplifies the learning objective. A comparison of VGGNet and ResNet is shown below. The most visible difference is that ResNet has many bypasses that connect the input of a block directly to a later layer; these are known as shortcut or skip connections.
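To make the idea concrete, here is a minimal sketch of a residual unit in PyTorch. The class name ResidualBlock, the channel count and the input size are illustrative assumptions, not taken from the torchvision implementation; the point is only that the block computes F(x) and then adds the unchanged input x back on.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Hypothetical minimal residual unit: output = relu(F(x) + x)."""

    def __init__(self, channels):
        super().__init__()
        # F(x): two 3x3 convolutions that keep spatial size and channel count
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = self.relu(self.bn1(self.conv1(x)))
        residual = self.bn2(self.conv2(residual))
        # Shortcut: add the untouched input to the learned residual
        return self.relu(residual + x)


block = ResidualBlock(channels=16)
x = torch.randn(2, 16, 32, 32)  # dummy batch: 2 feature maps of 16 channels
y = block(x)
print(y.shape)
```

Because the shortcut is a plain addition, the output must have the same shape as the input, which is why both convolutions use padding=1 and keep the channel count.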
ResNet uses two kinds of residual module: one chains two 3×3 convolutions, and the other chains a 1×1, a 3×3 and a 1×1 convolution. Both are shown below:
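A quick back-of-the-envelope comparison of the two modules, as a sketch only. The channel widths (64 for the two-convolution module, 256 reduced to 64 and expanded back to 256 for the three-convolution module) are assumptions taken from the first stage of the published ResNet configurations, not from this article, and batch-norm parameters are ignored.

```python
def conv_weights(in_ch, out_ch, k):
    """Number of weights in a k x k convolution without bias."""
    return in_ch * out_ch * k * k


# Module 1: two 3x3 convolutions on 64 channels
basic_block = conv_weights(64, 64, 3) + conv_weights(64, 64, 3)

# Module 2: 1x1 reduces 256 -> 64, 3x3 works on 64, 1x1 expands 64 -> 256
bottleneck = (conv_weights(256, 64, 1)
              + conv_weights(64, 64, 3)
              + conv_weights(64, 256, 1))

print(basic_block)  # 73728
print(bottleneck)   # 69632
```

Under these assumptions the three-convolution module handles four times as many input channels with slightly fewer weights, which is why the deeper variants rely on it.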
ResNet comes in several depths, most commonly 18, 34, 50, 101 and 152 layers. All of them are built by stacking the residual modules described above. The following figure shows the different ResNet models.
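The layer counts follow directly from how many modules are stacked. The sketch below reproduces them; the per-stage block counts are the ones from the published ResNet configurations and are an assumption here, since the article only shows them in a figure.

```python
# depth -> (blocks in each of the 4 stages, convolutions per block)
CONFIGS = {
    18: ([2, 2, 2, 2], 2),
    34: ([3, 4, 6, 3], 2),
    50: ([3, 4, 6, 3], 3),
    101: ([3, 4, 23, 3], 3),
    152: ([3, 8, 36, 3], 3),
}


def count_layers(blocks_per_stage, convs_per_block):
    # +2 accounts for the first convolution and the final fully connected layer
    return sum(blocks_per_stage) * convs_per_block + 2


for depth, (blocks, convs) in CONFIGS.items():
    print(depth, count_layers(blocks, convs))
```

Each line prints the nominal depth next to the computed one, so the two columns should match.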
This article uses ResNet18 for image classification, taking the model that ships with PyTorch (torchvision).
See the article below for details; it covers the architecture in depth. In real projects, however, the official implementation is preferable: it comes with pretrained weights, and some of the models have been optimized.
Rip ResNet – Relive ResNet (Pytorch) _AI Hao -CSDN blog
Import the libraries used by the project
import torch.optim as optim
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models
# from effnetv2 import effnetv2_s  # optional: only needed for the EfficientNetV2 alternative, not for ResNet18
from torch.autograd import Variable
Setting global Parameters
Set the batch size, learning rate and number of epochs, and check whether a CUDA environment is available. If it is not, fall back to the CPU.
# Set global parameters
modellr = 1e-4
BATCH_SIZE = 64
EPOCHS = 20
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
Image preprocessing
During image preprocessing, the transform for the training set and the transform for the validation set are defined separately. Besides resizing and normalization, the training transform can apply data augmentation such as rotation and random erasing, whereas the validation set needs no augmentation. Do not augment blindly, though: unreasonable augmentation can hurt results and may even prevent the loss from converging.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
Read the data
Read the data with PyTorch's default ImageFolder loader. The data directory is laid out as shown below:
The training set uses 10,000 cat and dog images taken from the Dogs vs. Cats dataset; the remaining images go into the validation set.
# fetch data
dataset_train = datasets.ImageFolder('data/train', transform)
print(dataset_train.imgs)
# Label of the corresponding folder
print(dataset_train.class_to_idx)
dataset_test = datasets.ImageFolder('data/val', transform_test)
# Label of the corresponding folder
print(dataset_test.class_to_idx)
# import data
train_loader = torch.utils.data.DataLoader(dataset_train, batch_size=BATCH_SIZE, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset_test, batch_size=BATCH_SIZE, shuffle=False)
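The labels printed by class_to_idx come from the folder names: ImageFolder sorts the class sub-folders by name and numbers them from 0. The sketch below mimics that rule with the standard library only, using a temporary directory; the helper name class_to_idx_from_folders is hypothetical.

```python
import os
import tempfile


def class_to_idx_from_folders(root):
    """Mimic ImageFolder's labelling: sub-folders sorted by name, indexed from 0."""
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    return {name: idx for idx, name in enumerate(classes)}


with tempfile.TemporaryDirectory() as root:
    # Create the folders in "wrong" order to show that sorting decides the labels
    for name in ('dog', 'cat'):
        os.makedirs(os.path.join(root, name))
    mapping = class_to_idx_from_folders(root)

print(mapping)  # {'cat': 0, 'dog': 1}
```

This is why cat ends up as 0 and dog as 1 regardless of the order in which the folders were created.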
Set up the model
Cross-entropy is used as the loss and resnet18 as the model. Using a pretrained model is recommended: during debugging it converges much faster, and it is enabled by setting pretrained to True. Replace the final fully connected layer so that it outputs 2 classes, then move the model to DEVICE. Adam is used as the optimizer.
# Instantiate the model and move it to the GPU
criterion = nn.CrossEntropyLoss()
model = torchvision.models.resnet18(pretrained=False)
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 2)
model.to(DEVICE)
# Use the simple, no-frills Adam optimizer with a low learning rate
optimizer = optim.Adam(model.parameters(), lr=modellr)
def adjust_learning_rate(optimizer, epoch):
    """Sets the learning rate to the initial LR decayed by 10 every 50 epochs"""
    modellrnew = modellr * (0.1 ** (epoch // 50))
    print("lr:", modellrnew)
    for param_group in optimizer.param_groups:
        param_group['lr'] = modellrnew
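The schedule is plain arithmetic: modellr * (0.1 ** (epoch // 50)). A small stand-alone sketch (the function name step_decay_lr and its step and gamma parameters are hypothetical) shows what it yields at a few epochs:

```python
import math


def step_decay_lr(base_lr, epoch, step=50, gamma=0.1):
    """Multiply base_lr by gamma once for every `step` completed epochs."""
    return base_lr * (gamma ** (epoch // step))


for epoch in (1, 20, 49, 50, 100):
    print(epoch, step_decay_lr(1e-4, epoch))

# The rate is unchanged until epoch 50, then drops by a factor of 10
assert step_decay_lr(1e-4, 20) == 1e-4
assert math.isclose(step_decay_lr(1e-4, 50), 1e-5)
```

Note that with EPOCHS = 20 the integer division epoch // 50 is always 0, so the learning rate stays at its initial value for the whole run.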
Set up training and validation
# Define the training process
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    sum_loss = 0
    total_num = len(train_loader.dataset)
    print(total_num, len(train_loader))
    for batch_idx, (data, target) in enumerate(train_loader):
        # Move the batch to the target device
        data, target = data.to(device), target.to(device)
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print_loss = loss.item()
        sum_loss += print_loss
        # Log progress every 50 batches
        if (batch_idx + 1) % 50 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, (batch_idx + 1) * len(data), len(train_loader.dataset),
                100. * (batch_idx + 1) / len(train_loader), loss.item()))
    ave_loss = sum_loss / len(train_loader)
    print('epoch:{},loss:{}'.format(epoch, ave_loss))
# Define the validation process
def val(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    total_num = len(test_loader.dataset)
    print(total_num, len(test_loader))
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            loss = criterion(output, target)
            # Predicted class = index of the largest logit
            _, pred = torch.max(output, 1)
            correct += torch.sum(pred == target).item()
            test_loss += loss.item()
    acc = correct / total_num
    avgloss = test_loss / len(test_loader)
    print('\nVal set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        avgloss, correct, len(test_loader.dataset), 100 * acc))
# training
for epoch in range(1, EPOCHS + 1):
    adjust_learning_rate(optimizer, epoch)
    train(model, DEVICE, train_loader, optimizer, epoch)
    val(model, DEVICE, test_loader)
torch.save(model, 'model.pth')
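The accuracy bookkeeping in val is easy to check by hand on a tiny batch. The sketch below uses made-up logits for four images and two classes; the numbers are purely illustrative.

```python
import torch

# Stand-in logits for a batch of 4 images and 2 classes (cat = 0, dog = 1)
logits = torch.tensor([[2.0, 0.5],
                       [0.1, 1.2],
                       [0.3, 0.9],
                       [1.5, 1.0]])
target = torch.tensor([0, 1, 0, 0])

# argmax over the class dimension gives the same indices as torch.max(logits, 1)[1]
pred = logits.argmax(dim=1)
correct = (pred == target).sum().item()
accuracy = correct / target.size(0)
print(pred.tolist(), correct, accuracy)  # [0, 1, 1, 0] 3 0.75
```

Only the third image is misclassified, so 3 of 4 predictions are correct.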
This is the result of training with the pretrained model; a good result is already obtained after 1 epoch.
Validation
The test set is stored in the following directory:
Step 1: define the classes. Their order must match the class order used in training, so do not change it. During training cat is 0 and dog is 1, so classes is defined as ('cat', 'dog').
Step 2: define the transforms. They are the same as the validation set's transforms, without data augmentation.
Step 3: load the model and move it to DEVICE.
Step 4: read each image and predict its class. Note that the image is read with the Image class from the PIL library. Do not use cv2 here: the transforms used expect a PIL image rather than the NumPy array that cv2 returns.
import os

import torch
import torchvision.transforms as transforms
from PIL import Image

# Class order must match the training labels: cat = 0, dog = 1
classes = ('cat', 'dog')
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# model.pth holds the whole pickled model saved by train.py;
# on PyTorch 2.6 or later, torch.load needs weights_only=False to load it
model = torch.load("model.pth", map_location=DEVICE)
model.eval()
model.to(DEVICE)
path = 'data/test/'
testList = os.listdir(path)
for file in testList:
    # Convert to RGB so grayscale or RGBA files still yield 3 channels
    img = Image.open(path + file).convert('RGB')
    img = transform_test(img)
    img.unsqueeze_(0)  # add the batch dimension
    img = img.to(DEVICE)
    with torch.no_grad():
        out = model(img)
    # Predict
    _, pred = torch.max(out, 1)
    print('Image Name:{},predict:{}'.format(file, classes[pred.item()]))
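Since the class order matters so much, one way to avoid hard-coding it is to derive the tuple from the class_to_idx mapping that the training script prints. This is only a sketch: the helper name classes_from_mapping is hypothetical, and the mapping is typed in by hand here rather than read from a real dataset.

```python
def classes_from_mapping(class_to_idx):
    """Return class names ordered by their training index."""
    ordered = sorted(class_to_idx.items(), key=lambda kv: kv[1])
    return tuple(name for name, _ in ordered)


# Mapping as printed by dataset_train.class_to_idx during training
class_to_idx = {'dog': 1, 'cat': 0}
classes = classes_from_mapping(class_to_idx)
print(classes)     # ('cat', 'dog')
print(classes[1])  # dog
```

Indexing the resulting tuple with the predicted index then always yields the name that index had during training.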
Running results:
You can also make use of datasets.ImageFolder; below, it is used to run prediction on the images. Update the path of the test dataset and wrap the test folder in one more folder level (data/datatest in the code below), as shown in the following figure:
Then change the way the images are read. The code is as follows:
import torch
import torchvision.transforms as transforms
import torchvision.datasets as datasets

# Class order must match the training labels: cat = 0, dog = 1
classes = ('cat', 'dog')
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# model.pth holds the whole pickled model saved by train.py;
# on PyTorch 2.6 or later, torch.load needs weights_only=False to load it
model = torch.load("model.pth", map_location=DEVICE)
model.eval()
model.to(DEVICE)
dataset_test = datasets.ImageFolder('data/datatest', transform_test)
print(len(dataset_test))
for index in range(len(dataset_test)):
    # The label is only the index of the wrapper folder, so it is ignored
    img, label = dataset_test[index]
    img.unsqueeze_(0)  # add the batch dimension
    data = img.to(DEVICE)
    with torch.no_grad():
        output = model(data)
    _, pred = torch.max(output, 1)
    print('Image Name:{},predict:{}'.format(dataset_test.imgs[index][0], classes[pred.item()]))
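Predicting one image at a time is simple but slow on large test sets. Wrapping the dataset in a DataLoader lets the model process several images per forward pass. The sketch below shows the pattern with stand-ins so that it runs without any files: random tensors replace the images and a tiny linear classifier replaces the trained ResNet18, both of which are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-ins (hypothetical): 10 random "images" and a tiny 2-class classifier
images = torch.randn(10, 3, 32, 32)
labels = torch.zeros(10, dtype=torch.long)  # dummy labels, ignored below
dataset = TensorDataset(images, labels)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))
model.eval()

# shuffle=False keeps predictions in the same order as the dataset
loader = DataLoader(dataset, batch_size=4, shuffle=False)
predictions = []
with torch.no_grad():
    for batch, _ in loader:
        predictions.extend(model(batch).argmax(dim=1).tolist())

print(len(predictions))  # 10
```

Because the loader is not shuffled, predictions[i] corresponds to the i-th entry of the dataset, so file names can still be looked up by index.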
Complete code:
train.py
import torch.optim as optim
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models
# from effnetv2 import effnetv2_s  # optional: only needed for the commented-out EfficientNetV2 alternative below
from torch.autograd import Variable
# Set the hyperparameter
BATCH_SIZE = 16
EPOCHS = 10
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Data preprocessing
transform = transforms.Compose([
    transforms.Resize((128, 128)),
    # transforms.RandomVerticalFlip(),
    # transforms.RandomCrop(50),
    # transforms.ColorJitter(brightness=0.5, contrast=0.5, hue=0.5),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
transform_test = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
# fetch data
dataset_train = datasets.ImageFolder('data/train', transform)
print(dataset_train.imgs)
# Label of the corresponding folder
print(dataset_train.class_to_idx)
dataset_test = datasets.ImageFolder('data/val', transform_test)
# Label of the corresponding folder
print(dataset_test.class_to_idx)
# import data
train_loader = torch.utils.data.DataLoader(dataset_train, batch_size=BATCH_SIZE, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset_test, batch_size=BATCH_SIZE, shuffle=False)
modellr = 1e-4
# Instantiate the model and move it to the GPU
criterion = nn.CrossEntropyLoss()
# model = effnetv2_s()
# num_ftrs = model.classifier.in_features
# model.classifier = nn.Linear(num_ftrs, 2)
model = torchvision.models.resnet18(pretrained=False)
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 2)
model.to(DEVICE)
# Use the simple, no-frills Adam optimizer with a low learning rate
optimizer = optim.Adam(model.parameters(), lr=modellr)
def adjust_learning_rate(optimizer, epoch):
    """Sets the learning rate to the initial LR decayed by 10 every 50 epochs"""
    modellrnew = modellr * (0.1 ** (epoch // 50))
    print("lr:", modellrnew)
    for param_group in optimizer.param_groups:
        param_group['lr'] = modellrnew
# Define the training process
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    sum_loss = 0
    total_num = len(train_loader.dataset)
    print(total_num, len(train_loader))
    for batch_idx, (data, target) in enumerate(train_loader):
        # Move the batch to the target device
        data, target = data.to(device), target.to(device)
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print_loss = loss.item()
        sum_loss += print_loss
        # Log progress every 50 batches
        if (batch_idx + 1) % 50 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, (batch_idx + 1) * len(data), len(train_loader.dataset),
                100. * (batch_idx + 1) / len(train_loader), loss.item()))
    ave_loss = sum_loss / len(train_loader)
    print('epoch:{},loss:{}'.format(epoch, ave_loss))
# Define the validation process
def val(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    total_num = len(test_loader.dataset)
    print(total_num, len(test_loader))
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            loss = criterion(output, target)
            # Predicted class = index of the largest logit
            _, pred = torch.max(output, 1)
            correct += torch.sum(pred == target).item()
            test_loss += loss.item()
    acc = correct / total_num
    avgloss = test_loss / len(test_loader)
    print('\nVal set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        avgloss, correct, len(test_loader.dataset), 100 * acc))
# training
for epoch in range(1, EPOCHS + 1):
    adjust_learning_rate(optimizer, epoch)
    train(model, DEVICE, train_loader, optimizer, epoch)
    val(model, DEVICE, test_loader)
torch.save(model, 'model.pth')
test1.py
import os

import torch
import torchvision.transforms as transforms
from PIL import Image

# Class order must match the training labels: cat = 0, dog = 1
classes = ('cat', 'dog')
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# model.pth holds the whole pickled model saved by train.py;
# on PyTorch 2.6 or later, torch.load needs weights_only=False to load it
model = torch.load("model.pth", map_location=DEVICE)
model.eval()
model.to(DEVICE)
path = 'data/test/'
testList = os.listdir(path)
for file in testList:
    # Convert to RGB so grayscale or RGBA files still yield 3 channels
    img = Image.open(path + file).convert('RGB')
    img = transform_test(img)
    img.unsqueeze_(0)  # add the batch dimension
    img = img.to(DEVICE)
    with torch.no_grad():
        out = model(img)
    # Predict
    _, pred = torch.max(out, 1)
    print('Image Name:{},predict:{}'.format(file, classes[pred.item()]))
test2.py
import torch
import torchvision.transforms as transforms
import torchvision.datasets as datasets

# Class order must match the training labels: cat = 0, dog = 1
classes = ('cat', 'dog')
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# model.pth holds the whole pickled model saved by train.py;
# on PyTorch 2.6 or later, torch.load needs weights_only=False to load it
model = torch.load("model.pth", map_location=DEVICE)
model.eval()
model.to(DEVICE)
dataset_test = datasets.ImageFolder('data/datatest', transform_test)
print(len(dataset_test))
for index in range(len(dataset_test)):
    # The label is only the index of the wrapper folder, so it is ignored
    img, label = dataset_test[index]
    img.unsqueeze_(0)  # add the batch dimension
    data = img.to(DEVICE)
    with torch.no_grad():
        output = model(data)
    _, pred = torch.max(output, 1)
    print('Image Name:{},predict:{}'.format(dataset_test.imgs[index][0], classes[pred.item()]))