Article source | turbine cloud community
Original address | dataset practice
Original author | Mathor
To be honest, I'm a big fan of Mathor in this community! So today I'm once again sharing his paper notes with everyone. Read on and see whether the content covers knowledge points you need!
The main text begins:
If you are not familiar with ResNet, first read this blog post: ResNet paper reading.
First, implement a Residual Block:
```python
import torch
from torch import nn
from torch.nn import functional as F

class ResBlk(nn.Module):
    def __init__(self, ch_in, ch_out, stride=1):
        super(ResBlk, self).__init__()
        self.conv1 = nn.Conv2d(ch_in, ch_out, kernel_size=3, stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(ch_out)
        self.conv2 = nn.Conv2d(ch_out, ch_out, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(ch_out)

        if ch_out == ch_in:
            self.extra = nn.Sequential()
        else:
            # the 1x1 convolution modifies the channels of input x
            # [b, ch_in, h, w] => [b, ch_out, h, w]
            self.extra = nn.Sequential(
                nn.Conv2d(ch_in, ch_out, kernel_size=1, stride=stride),
                nn.BatchNorm2d(ch_out),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # short cut
        out = self.extra(x) + out
        out = F.relu(out)
        return out
```
Batch normalization is performed inside the Block to make training faster and more stable. Also, if the ch_in of the input and the ch_out of the main path do not match, adding the two tensors raises an error, so a check is needed: when they differ, the shortcut is adjusted with a 1×1 convolution.
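To make the dimension problem concrete, here is a standalone sketch (not part of the original article, reusing the same 64→128, stride-2 numbers as the test below). The main path changes both the channel count and the spatial size, so the input must be passed through a strided 1×1 convolution before the two tensors can be added:

```python
import torch
from torch import nn

x = torch.randn(2, 64, 32, 32)

# main path: 3x3 conv with stride 2 -> [2, 128, 16, 16]
main_path = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
    nn.BatchNorm2d(128),
)
out = main_path(x)

# adding x directly would fail: [2, 64, 32, 32] vs [2, 128, 16, 16];
# the 1x1 conv with the same stride fixes both channels and spatial size
shortcut = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=1, stride=2),
    nn.BatchNorm2d(128),
)
res = out + shortcut(x)
print(res.shape)  # torch.Size([2, 128, 16, 16])
```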
Test it:

```python
blk = ResBlk(64, 128, stride=2)
tmp = torch.randn(2, 64, 32, 32)
out = blk(tmp)
print(out.shape)
```
The output shape is torch.Size([2, 128, 16, 16]).
Here is why some layers need a stride greater than 1. In a Residual Block the channel count grows, for example from 64 to 128. If every stride were 1 (with padding 1), the W and H of the feature maps would never shrink while the channels kept increasing, so the memory and computation of the whole network would keep growing. And this is only one Block, not to mention the FC layer and the remaining Blocks. So the strides cannot all be set to 1; the feature maps must be downsampled to keep the network from growing without bound.
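As a sanity check, the standard convolution output-size formula shows why kernel 3, stride 1, padding 1 preserves the spatial size while stride 2 halves it (a small sketch, not part of the original code):

```python
def conv_out(h, k, s, p):
    # output size along one spatial dimension: floor((h + 2p - k) / s) + 1
    return (h + 2 * p - k) // s + 1

# kernel 3, stride 1, padding 1: spatial size preserved
print(conv_out(32, k=3, s=1, p=1))  # 32
# kernel 3, stride 2, padding 1: spatial size halved,
# matching the [2, 64, 32, 32] -> [2, 128, 16, 16] test above
print(conv_out(32, k=3, s=2, p=1))  # 16
```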
Then we build the complete ResNet-18:
```python
class ResNet18(nn.Module):
    def __init__(self):
        super(ResNet18, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=3, padding=0),
            nn.BatchNorm2d(64),
        )
        # followed by 4 blocks
        # [b, 64, h, w] => [b, 128, h, w]
        self.blk1 = ResBlk(64, 128, stride=2)
        # [b, 128, h, w] => [b, 256, h, w]
        self.blk2 = ResBlk(128, 256, stride=2)
        # [b, 256, h, w] => [b, 512, h, w]
        self.blk3 = ResBlk(256, 512, stride=2)
        # [b, 512, h, w] => [b, 512, h, w]
        self.blk4 = ResBlk(512, 512, stride=2)
        self.outlayer = nn.Linear(512 * 1 * 1, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        # after four blocks: [b, 64, h, w] => [b, 512, h, w]
        x = self.blk1(x)
        x = self.blk2(x)
        x = self.blk3(x)
        x = self.blk4(x)
        x = self.outlayer(x)
        return x
```
Test it:

```python
x = torch.randn(2, 3, 32, 32)
model = ResNet18()
out = model(x)
print("ResNet:", out.shape)
```
Running this produces an error with the following message:
```
size mismatch, m1: [2048 x 2], m2: [512 x 10] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:961
```
If you print x.shape after the last Block, you will see it is torch.Size([2, 512, 2, 2]), while the linear layer expects an input of size 512.
You can either modify the input size of the linear layer to match, or add an operation after the last Block so the tensor is reduced to 512 features.
The modified code is given first, with the explanation after it:
```python
class ResNet18(nn.Module):
    def __init__(self):
        super(ResNet18, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=3, padding=0),
            nn.BatchNorm2d(64),
        )
        # followed by 4 blocks
        # [b, 64, h, w] => [b, 128, h, w]
        self.blk1 = ResBlk(64, 128, stride=2)
        # [b, 128, h, w] => [b, 256, h, w]
        self.blk2 = ResBlk(128, 256, stride=2)
        # [b, 256, h, w] => [b, 512, h, w]
        self.blk3 = ResBlk(256, 512, stride=2)
        # [b, 512, h, w] => [b, 512, h, w]
        self.blk4 = ResBlk(512, 512, stride=2)
        self.outlayer = nn.Linear(512 * 1 * 1, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        # after four blocks: [b, 64, h, w] => [b, 512, h, w]
        x = self.blk1(x)
        x = self.blk2(x)
        x = self.blk3(x)
        x = self.blk4(x)
        # print("after conv:", x.shape)  # [b, 512, 2, 2]
        # [b, 512, h, w] => [b, 512, 1, 1]
        x = F.adaptive_avg_pool2d(x, [1, 1])
        # [b, 512, 1, 1] => [b, 512*1*1]
        x = x.view(x.size(0), -1)
        x = self.outlayer(x)
        return x
```
The second method is used here: an adaptive pooling layer is added after the last Block. Whatever the width and height of the input tensor, its output width and height are always 1, and all other dimensions stay the same. Then a reshape operation turns the [batchsize, 512, 1, 1] tensor into one of size [batchsize, 512*1*1], which can be fed into the linear layer on top; that linear layer's input size is 512 and its output size is 10. So the final output shape of the whole network is [batchsize, 10].
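The pooling and reshape steps can be verified in isolation, using the shape printed in the text (a minimal sketch, not from the original article):

```python
import torch
import torch.nn.functional as F

x = torch.randn(2, 512, 2, 2)          # the shape printed after the last Block
x = F.adaptive_avg_pool2d(x, [1, 1])   # -> [2, 512, 1, 1], regardless of the input H and W
x = x.view(x.size(0), -1)              # -> [2, 512], ready for the linear layer
print(x.shape)  # torch.Size([2, 512])
```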
Finally, take the LeNet5 training code from the previous article and simply change model = LeNet5() to model = ResNet18(). The complete code is as follows:
```python
import torch
from torch import nn, optim
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

batch_size = 32

cifar_train = datasets.CIFAR10(root='cifar', train=True, transform=transforms.Compose([
    transforms.Resize([32, 32]),
    transforms.ToTensor(),
]), download=True)
cifar_train = DataLoader(cifar_train, batch_size=batch_size, shuffle=True)

cifar_test = datasets.CIFAR10(root='cifar', train=False, transform=transforms.Compose([
    transforms.Resize([32, 32]),
    transforms.ToTensor(),
]), download=True)
cifar_test = DataLoader(cifar_test, batch_size=batch_size, shuffle=True)

class ResBlk(nn.Module):
    def __init__(self, ch_in, ch_out, stride=1):
        super(ResBlk, self).__init__()
        self.conv1 = nn.Conv2d(ch_in, ch_out, kernel_size=3, stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(ch_out)
        self.conv2 = nn.Conv2d(ch_out, ch_out, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(ch_out)

        if ch_out == ch_in:
            self.extra = nn.Sequential()
        else:
            # the 1x1 convolution modifies the channels of input x
            # [b, ch_in, h, w] => [b, ch_out, h, w]
            self.extra = nn.Sequential(
                nn.Conv2d(ch_in, ch_out, kernel_size=1, stride=stride),
                nn.BatchNorm2d(ch_out),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # short cut
        out = self.extra(x) + out
        out = F.relu(out)
        return out

class ResNet18(nn.Module):
    def __init__(self):
        super(ResNet18, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=3, padding=0),
            nn.BatchNorm2d(64),
        )
        # followed by 4 blocks
        # [b, 64, h, w] => [b, 128, h, w]
        self.blk1 = ResBlk(64, 128, stride=2)
        # [b, 128, h, w] => [b, 256, h, w]
        self.blk2 = ResBlk(128, 256, stride=2)
        # [b, 256, h, w] => [b, 512, h, w]
        self.blk3 = ResBlk(256, 512, stride=2)
        # [b, 512, h, w] => [b, 512, h, w]
        self.blk4 = ResBlk(512, 512, stride=2)
        self.outlayer = nn.Linear(512 * 1 * 1, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        # after four blocks: [b, 64, h, w] => [b, 512, h, w]
        x = self.blk1(x)
        x = self.blk2(x)
        x = self.blk3(x)
        x = self.blk4(x)
        # print("after conv:", x.shape)  # [b, 512, 2, 2]
        # [b, 512, h, w] => [b, 512, 1, 1]
        x = F.adaptive_avg_pool2d(x, [1, 1])
        # [b, 512, 1, 1] => [b, 512*1*1]
        x = x.view(x.size(0), -1)
        x = self.outlayer(x)
        return x

def main():
    ########## train ##########
    # device = torch.device('cuda')
    # model = ResNet18().to(device)
    criteon = nn.CrossEntropyLoss()
    model = ResNet18()
    optimizer = optim.Adam(model.parameters(), 1e-3)
    for epoch in range(1000):
        model.train()
        for batchidx, (x, label) in enumerate(cifar_train):
            # x, label = x.to(device), label.to(device)
            logits = model(x)
            # logits: [b, 10]
            # label:  [b]
            loss = criteon(logits, label)
            # backward
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print('train:', epoch, loss.item())

        ########## test ##########
        model.eval()
        with torch.no_grad():
            total_correct = 0
            total_num = 0
            for x, label in cifar_test:
                # x, label = x.to(device), label.to(device)
                # [b, 10]
                logits = model(x)
                # [b]
                pred = logits.argmax(dim=1)
                # [b] vs [b]
                total_correct += torch.eq(pred, label).float().sum().item()
                total_num += x.size(0)
            acc = total_correct / total_num
            print('test:', epoch, acc)

if __name__ == '__main__':
    main()
```
Compared with LeNet5, ResNet's accuracy improves much faster. However, because of the increased number of layers, the running time inevitably grows; without a GPU, one epoch takes about 15 minutes. Readers can also modify the network structure on this basis and apply tricks, such as Normalizing the images at the start.