Pet cats with code! This article is participating in the [Cat Essay Campaign].
Preface
This article shows how to use a ResNet network to tell cat images from dog images, with accuracy reaching 98%. Mom no longer has to worry about me failing to recognize my cat.
Installing the MegEngine framework
MegEngine is a one-stop deep learning model development platform, a place to start growing your AI skills.
The MegEngine framework can be installed with the following command:
pip3 install megengine -f https://megengine.org.cn/whl/mge.html
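A quick way to confirm the install worked is to ask Python whether it can locate the package. The snippet below is a small sketch using only the standard library; the helper name `is_installed` is my own and not part of MegEngine.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if `package` can be found on the current Python path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # Should print True once the pip3 command above has succeeded.
    print("megengine installed:", is_installed("megengine"))
```

If this prints False, the package was probably installed for a different Python interpreter than the one running the script.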
You can also fork the public project to study the model.
This month's campaign theme is petting cats with code, so I came to study the Cat and Dog Wars project on MegEngine.
Learning record
Project introduction
Cat and Dog Wars is a project on the MegEngine platform that uses deep learning to distinguish cats from dogs. With ResNet, the accuracy reaches 98%.
Data preparation
Build the dataset on top of MegEngine's Dataset class
First, import the dependencies
from typing import Tuple
import numpy as np
from megengine.data.dataset import Dataset
import os
import cv2
The 1000 images are split into a training set and a test set at a ratio of 9:1
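The article does not show how the 9:1 split was produced. A minimal sketch of one way to do it, assuming 500 cat and 500 dog files named like `cat.<n>.jpg` and `dog.<n>.jpg` (the file names, the 500/500 balance and the helper `split_files` are all assumptions for illustration):

```python
import random


def split_files(files, train_ratio=0.9, seed=0):
    """Shuffle `files` reproducibly and split them into (train, test)."""
    shuffled = sorted(files)               # sort first so the result depends only on the seed
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names; the real dataset may be named differently
files = [f"cat.{i}.jpg" for i in range(500)] + [f"dog.{i}.jpg" for i in range(500)]
train_files, test_files = split_files(files)
print(len(train_files), len(test_files))
```

The two lists could then be copied into the `train` and `test` folders that the dataset class reads from.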
class CatVsDogDataset(Dataset):
    def __init__(self, mode, dir):
        super().__init__()
        self.mode = mode
        self.dir = dir
        self.data_size = 0
        self.data = []
        self.label = []
        # self.data is the data, and self.label is the label corresponding to the data
        if self.mode == 'train':
            dir = os.path.join(dir, "train")
            for file in os.listdir(dir):  # read each file
                img = cv2.imread(os.path.join(dir, file))
                self.data.append(img)
                name = file.split(sep='.')
                if name[0] == 'cat':
                    self.label.append(0)  # the label of cat is 0
                else:
                    self.label.append(1)  # the label of dog is 1
        elif self.mode == 'test':
            dir = os.path.join(dir, "test")
            for file in os.listdir(dir):
                img = cv2.imread(os.path.join(dir, file))
                self.data.append(img)
                name = file.split(sep='.')
                if name[0] == 'cat':
                    self.label.append(0)  # the label of cat is 0
                else:
                    self.label.append(1)  # the label of dog is 1
        else:
            print('Undefined Dataset!')
        self.data = np.array(self.data)
        self.label = np.array(self.label)
        print(self.data.shape)
        print(self.label.shape)

    # Define the method to get each sample in the dataset
    def __getitem__(self, index: int) -> Tuple:
        return self.data[index], self.label[index]

    # Define a method to return the length of the dataset
    def __len__(self) -> int:
        return len(self.data)
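The labelling rule in the class above depends entirely on the file name: everything before the first dot is compared with `cat`. Here is that rule in isolation, as a small sketch (the function name `label_from_filename` and the sample file names are made up for illustration):

```python
def label_from_filename(filename: str) -> int:
    """Map names starting with 'cat.' to 0 and every other name to 1."""
    prefix = filename.split(sep=".")[0]
    return 0 if prefix == "cat" else 1


for name in ["cat.12.jpg", "dog.7.jpg", ".DS_Store"]:
    print(name, "->", label_from_filename(name))
```

Note the last case: any stray file in the folder that is not named `cat.*` is silently labelled as a dog, so it is worth keeping the image folders clean.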
Check the partitioned data:
import os
print("Total number of training data sets:", len(os.listdir("./dataset/CatVsDog/train")))
print("Total test data sets:", len(os.listdir("./dataset/CatVsDog/test")))
train_dataset = CatVsDogDataset("train", "./dataset/CatVsDog")
test_dataset = CatVsDogDataset("test", "./dataset/CatVsDog")
Data preparation is complete
Build a ResNet network structure
What is the ResNet network architecture?
ResNet feeds the output of an earlier layer directly into the input of a later layer, skipping the layers in between. The later feature layer therefore receives a linear contribution from that earlier layer.
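In formula form, a residual block computes y = F(x) + x, where F stands for the skipped layers. The toy sketch below uses plain Python lists instead of real tensors and convolution layers; `residual_block` and `zero_transform` are names invented for this illustration, not MegEngine functions.

```python
def residual_block(x, transform):
    """Return transform(x) + x, element by element (the skip connection)."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("transform must preserve the length of x")
    return [f + xi for f, xi in zip(fx, x)]


def zero_transform(x):
    """Stand-in for stacked layers that have not learned anything yet."""
    return [0.0 for _ in x]


x = [1.0, -2.0, 3.0]
print(residual_block(x, zero_transform))  # the input passes through unchanged
```

This is the point of the shortcut: even if the skipped layers contribute nothing, the block still passes its input forward, so adding more blocks should not make the network worse than a shallower one.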
Experience shows that network depth is crucial to model performance. With more layers, the network can extract more complex feature patterns, so in theory deeper models should give better results. In practice, however, adding layers leads to the degradation problem of deep networks, which has been a major obstacle to making networks deeper.
The ResNet algorithm proposed by Dr. Kaiming He addressed the difficulty of training deep CNN models. In 2014, VGG had only 19 layers, while ResNet reached 152 layers in 2015, which demonstrates ResNet's advantage.
The core code builds on the megengine.functional and megengine.module APIs.
This algorithm is not easy to implement by hand in plain Python; see MegEngine's open-source project.
MegEngine officially provides pretrained models that we can use directly. We simply download the model; it has already been trained and does not need to be trained again.
os.system("wget https://data.megengine.org.cn/models/weights/resnet18_naiveaug_70312_78a63ca6.pkl")
Of course, the model can also be trained further.
Model training
def model_train():
    import megengine as mge
    import megengine.functional as F
    import megengine.optimizer as optim
    import numpy as np
    from megengine.data import DataLoader, RandomSampler
    from megengine.jit import trace

    # resnet18 is assumed to be defined in the project's ResNet model file
    smallnet = resnet18()
    # optional: start from the pretrained weights
    state_dict = mge.load('resnet18_naiveaug_70312_78a63ca6.pkl')
    smallnet.load_state_dict(state_dict)

    batch_size = 16
    sampler = RandomSampler(dataset=train_dataset, batch_size=batch_size, drop_last=True)
    from megengine.data import transform
    transform = transform.Compose([
        transform.RandomResizedCrop(224),
        transform.RandomHorizontalFlip(),
        transform.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
        transform.Lighting(0.1),
        transform.Normalize(
            mean=[103.530, 116.280, 123.675], std=[57.375, 57.120, 58.395]
        ),
        transform.ToMode("CHW"),
    ])
    train_dataloader = DataLoader(
        train_dataset,
        sampler=sampler,
        transform=transform,
    )

    # Define static graph training function
    @trace(symbolic=True)
    def train_func(data, label, *, net, optimizer):
        net.train()  # set the network to training mode
        pred = net(data)
        # Use cross entropy loss
        loss = F.cross_entropy_with_softmax(pred, label)
        optimizer.backward(loss)
        return pred, loss

    # Define optimizer
    opt = optim.SGD(smallnet.parameters(), lr=0.001, momentum=0.9, weight_decay=1e-4)

    # Model training
    # set trace.enabled=False if you want to run eager mode
    # trace.enabled = False
    # Training iterations; the optimizer updates the parameters.
    # The original demo comment mentions 10 epochs for demonstration; this run uses 100.
    # Actual training could be set to 200 epochs, reducing the LR to 0.01 and 0.001
    # at the 100th and 150th epochs, respectively.
    epochs = 100
    data_tensor = mge.tensor(dtype=np.float32)
    label_tensor = mge.tensor(dtype=np.int32)
    losses = []
    for i in range(epochs):
        print(".")
        loss_rec = []
        for data, label in train_dataloader:
            # Debug: display the first image of the batch
            # img = np.array(data[0])
            # img = np.transpose(img, [1, 2, 0])
            # print(img.shape)
            # cv2.imshow("img", img)
            # cv2.waitKey(0)
            data_tensor.set_value(data)
            label_tensor.set_value(label.astype("int32"))
            opt.zero_grad()
            # pred = smallnet(data)
            # print(pred.shape)
            # exit()
            _, loss = train_func(data_tensor, label_tensor, net=smallnet, optimizer=opt)
            opt.step()
            loss_rec.append(loss.numpy().item())
        loss = sum(loss_rec) / len(loss_rec)
        losses.append(loss)
        print("[Epoch {}] loss: {}".format(i, loss))

    # Loss visualization and model saving
    import matplotlib.pyplot as plt
    plt.plot(range(len(losses)), losses, color='red')
    plt.xlabel("iterator")
    plt.ylabel('loss')
    plt.show()
    # Save the model
    mge.save(smallnet.state_dict(), 'resnet18_static_100.mge')
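The comments in the training code mention a longer run of 200 epochs with the learning rate dropping to 0.01 and 0.001 at epochs 100 and 150. The training code itself does not implement this, so here is a rough sketch of such a step schedule. The starting value of 0.1 is my assumption (the comment only gives the two lower values), and `lr_at_epoch` is a made-up helper name.

```python
def lr_at_epoch(epoch: int, base_lr: float = 0.1) -> float:
    """Step schedule: base_lr, then 0.01 from epoch 100, then 0.001 from epoch 150."""
    if epoch >= 150:
        return 0.001
    if epoch >= 100:
        return 0.01
    return base_lr


for epoch in (0, 99, 100, 149, 150, 199):
    print(epoch, lr_at_epoch(epoch))
```

At the start of each epoch the returned value would be written into the optimizer. As far as I know this is usually done by assigning to `param_group["lr"]` for each entry of `opt.param_groups`, but please check that against the MegEngine version you are using.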
Forgive me, I'm a beginner and don't fully understand the training part of the model yet.
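One piece of the training function that can be worked through by hand is the loss, cross entropy with softmax. The sketch below is a plain-Python illustration of the idea for a single image, not MegEngine's actual implementation; the function names and the example scores are invented.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])


# Two classes: index 0 = cat, index 1 = dog
print(cross_entropy([2.0, 0.0], 0))  # small loss: the network favours the right class
print(cross_entropy([2.0, 0.0], 1))  # larger loss: the network favours the wrong class
```

Training pushes this number down: the optimizer changes the weights so that the score of the correct class grows relative to the other one.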
Model test
If you use the model provided by MegEngine, you don't need to write separate test functions; otherwise you have to write test functions as complex as the training code. This shows the power of Megvii.
def model_test():
    """Model loading and testing."""
    import megengine as mge
    import megengine.functional as F
    from megengine.data import DataLoader, SequentialSampler
    from megengine.jit import trace

    # resnet18 is assumed to be defined in the project's ResNet model file
    smallnet = resnet18()
    state_dict = mge.load('resnet18_static_100.mge')
    smallnet.load_state_dict(state_dict)

    # Create a DataLoader for testing
    from megengine.data import transform
    batch_size = 1
    sampler_test = SequentialSampler(dataset=test_dataset, batch_size=batch_size)
    transform_test = transform.Compose([
        transform.Resize(256),
        transform.CenterCrop(224),
        transform.Normalize(
            mean=[103.530, 116.280, 123.675], std=[57.375, 57.120, 58.395]),  # BGR
        transform.ToMode("CHW"),
    ])
    test_dataloader = DataLoader(
        test_dataset,
        sampler=sampler_test,
        transform=transform_test,
    )

    # Define static graph test function to test the model
    @trace(symbolic=True)
    def eval_func(data, label, *, net):
        net.eval()  # set the network to test mode
        pred = net(data)
        loss = F.cross_entropy_with_softmax(pred, label)
        return pred, loss

    data_tensor = mge.tensor()
    label_tensor = mge.tensor(dtype=np.int32)
    correct = 0
    total = 0
    for data, label in test_dataloader:
        label = label.astype("int32")
        pred, _ = eval_func(data, label, net=smallnet)
        pred_label = F.argmax(pred, axis=1)
        # if (pred_label.numpy()[0] != label[0]):
        #     img = np.array(data[0])
        #     img = np.transpose(img, [1, 2, 0])
        #     print(img.shape)
        #     cv2.imshow("img", img)
        #     cv2.waitKey(0)
        correct += (pred_label == label).sum().numpy().item()
        total += label.shape[0]
    print("correct: {}, total: {}, accuracy: {:.2f}%".format(correct, total, correct * 100.0 / total))
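The accuracy bookkeeping at the end of the test function comes down to two steps: take the index of the highest score for each image, then count how often it matches the label. The same logic without any framework, as a sketch with made-up scores:

```python
def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy(predictions, labels):
    """Percentage of rows whose highest-scoring class equals the label."""
    correct = sum(1 for scores, label in zip(predictions, labels) if argmax(scores) == label)
    return correct * 100.0 / len(labels)


# Made-up scores for four images: [cat score, dog score]
preds = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
labels = [0, 1, 1, 1]
print("accuracy: {:.2f}%".format(accuracy(preds, labels)))
```

Here three of the four predictions match their labels; the third image is scored as a cat although its label is dog.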
Matters needing attention
- The model provided by MegStudio is trained for 1000-class ImageNet classification; it can be adapted to 2-class classification while still loading the pretrained weights
- Training on MegStudio is a bit slow, so I recommend copying the code to your local machine
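For the first note, one common way to reuse 1000-class weights in a 2-class network is to keep the backbone weights and drop the final classifier's entries before loading. The sketch below shows only the filtering step on a toy dictionary. The `fc.` prefix for the final layer and all key names are assumptions, so print the keys of your own checkpoint first.

```python
def drop_classifier_weights(state_dict, head_prefix="fc."):
    """Return a copy of `state_dict` without the entries of the final classifier."""
    return {k: v for k, v in state_dict.items() if not k.startswith(head_prefix)}


# Toy stand-in for a checkpoint; real values would be arrays, not strings
pretrained = {
    "conv1.weight": "backbone weights",
    "layer1.0.conv1.weight": "backbone weights",
    "fc.weight": "1000-class head",
    "fc.bias": "1000-class head",
}
print(sorted(drop_classifier_weights(pretrained)))
```

The filtered dictionary could then be loaded into a network whose last layer has 2 outputs. Whether `load_state_dict` accepts a partial dictionary (for example through a `strict=False` argument) depends on the MegEngine version, so verify that before relying on it.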
Thanks
- References: MegEngine Cat and Dog Wars, MegStudio
- Thanks to my buddy Battlefield bag for the invitation; here is their home page (juejin.cn/user/442409…).