Petting cats with code! This article is participating in the [Cat Essay Campaign].

Preface

This article introduces how to use a ResNet network to distinguish cat and dog images, with an accuracy of up to 98%. Mom no longer has to worry about me failing to recognize my cat.

Install the MegEngine framework

MegEngine is a one-stop deep learning model development platform for growing your AI skills.

The MegEngine framework can be installed with the following command:

pip3 install megengine -f https://megengine.org.cn/whl/mge.html
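After installation, a quick import check (just a sanity check, assuming a standard Python 3 environment) confirms that MegEngine is available:

import megengine
print(megengine.__version__)  # prints the installed MegEngine version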

You can also fork the public project to study the model.

This month's campaign is about petting cats with code, so I came to study the Cat and Dog Wars project on MegEngine.

Learning record

Project introduction

Cat and Dog Wars is a project on the MegEngine platform that uses deep learning to distinguish cats from dogs. With ResNet, the accuracy reaches 98%.

Data preparation

Construct the dataset based on MegEngine's Dataset class.

First, import the dependencies:

from typing import Tuple
import numpy as np
from megengine.data.dataset import Dataset
import os
import cv2

The 1000 images are split into a training set and a test set at a ratio of 9:1.

class CatVsDogDataset(Dataset):
    def __init__(self, mode, dir):
        super().__init__()
        self.mode = mode
        self.dir = dir
        self.data_size = 0
        self.data = []
        self.label = []
        # self.data is the data, and self.label is the label corresponding to the data
        if self.mode == 'train':
            dir = os.path.join(dir, "train")
            for file in os.listdir(dir):  # read each file
                img = cv2.imread(os.path.join(dir, file))
                self.data.append(img)
                name = file.split(sep='.')
                if name[0] == 'cat':
                    self.label.append(0)  # the label of cat is 0
                else:
                    self.label.append(1)  # the label of dog is 1
        elif self.mode == 'test':
            dir = os.path.join(dir, "test")
            for file in os.listdir(dir):
                img = cv2.imread(os.path.join(dir, file))
                self.data.append(img)
                name = file.split(sep='.')
                if name[0] == 'cat':
                    self.label.append(0)  # the label of cat is 0
                else:
                    self.label.append(1)  # the label of dog is 1
        else:
            print('Undefined Dataset!')
        self.data = np.array(self.data)
        self.label = np.array(self.label)
        print(self.data.shape)
        print(self.label.shape)

    # Define the method to get each sample in the dataset
    def __getitem__(self, index: int) -> Tuple:
        return self.data[index], self.label[index]

    # Define the method to return the length of the dataset
    def __len__(self) -> int:
        return len(self.data)

Check the partitioned data:

import os
print("Total number of training samples:", len(os.listdir("./dataset/CatVsDog/train")))
print("Total number of test samples:", len(os.listdir("./dataset/CatVsDog/test")))
train_dataset = CatVsDogDataset("train", "./dataset/CatVsDog")
test_dataset = CatVsDogDataset("test", "./dataset/CatVsDog")

Data preparation is complete
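As a quick sanity check (a minimal sketch that uses only the CatVsDogDataset class defined above), you can fetch one sample and inspect its shape and label:

sample_img, sample_label = train_dataset[0]
print(sample_img.shape)   # an (H, W, 3) BGR image as read by cv2
print(sample_label)       # 0 for cat, 1 for dog
print(len(train_dataset), len(test_dataset))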

Build a ResNet network structure

What is the ResNet network architecture

In a ResNet, the output of an earlier layer is fed directly into the input of a later layer, skipping the layers in between. This means the later feature maps receive a direct (linear) contribution from an earlier layer.

From experience, network depth is crucial to model performance. With more layers, the network can extract more complex feature patterns, so a deeper model should in theory give better results. In practice, however, simply stacking more layers leads to the degradation problem, which is a major obstacle to training deep networks.
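To make the skip connection concrete, here is a rough sketch of a basic residual block. This is only an illustration, not the project's actual ResNet code; it assumes MegEngine's megengine.module provides Conv2d and BatchNorm2d, and megengine.functional provides relu, with the usual PyTorch-like signatures.

import megengine.module as M
import megengine.functional as F

class BasicResidualBlock(M.Module):
    """Computes relu(F(x) + x): the input is added back onto the block's output."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = M.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.bn1 = M.BatchNorm2d(channels)
        self.conv2 = M.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.bn2 = M.BatchNorm2d(channels)

    def forward(self, x):
        identity = x                         # the shortcut (skipped) path
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + identity                 # skip connection: add the earlier output back in
        return F.relu(out)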

The ResNet architecture proposed by Kaiming He solved the problem that deep CNN models were difficult to train. In 2014, VGG had only 19 layers, while ResNet reached 152 layers in 2015, which demonstrates the superiority of ResNet.

The core code is built on the megengine.functional and megengine.module APIs.

This network is not easy to implement from scratch in pure Python; see the open MegEngine project for the full implementation.

MegEngine officially provides pre-trained models that we can use directly. We simply download the weights; the model has already been trained and does not need to be trained again.

os.system("wget https://data.megengine.org.cn/models/weights/resnet18_naiveaug_70312_78a63ca6.pkl")


Of course, the model can also be trained further.

Model training

def model_train():
    import megengine as mge
    smallnet = resnet18()
    # optional
    state_dict = mge.load('resnet18_naiveaug_70312_78a63ca6.pkl')
    smallnet.load_state_dict(state_dict)
    batch_size = 16
    sampler = RandomSampler(dataset=train_dataset, batch_size=batch_size, drop_last=True)
    from megengine.data import transform
    transform = transform.Compose([
        transform.RandomResizedCrop(224),
        transform.RandomHorizontalFlip(),
        transform.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
        transform.Lighting(0.1),
        transform.Normalize(
            mean=[103.530, 116.280, 123.675], std=[57.375, 57.120, 58.395]
        ),  
        transform.ToMode("CHW"),
    ])
    train_dataloader = DataLoader(
        train_dataset,
        sampler=sampler,
        transform=transform,
    )

    # Define static graph training function
    @trace(symbolic=True)
    def train_func(data, label, *, net, optimizer):
        net.train()  # Set the network to training mode
        pred = net(data)
        # Use cross entropy loss
        loss = F.cross_entropy_with_softmax(pred, label)
        optimizer.backward(loss)
        return pred, loss

    # Define optimizer
    opt = optim.SGD(smallnet.parameters(), lr=0.001, momentum=0.9, weight_decay=1e-4)
    # Model training
    import megengine as mge
    import numpy as np

    # set trace.enabled=False if you want to run eager mode
    # trace.enabled = False

    # Training iterations: the optimizer updates the parameters.
    # For demonstration purposes, only a reduced number of epochs is iterated here.
    # Actual training could be set to 200 epochs, reducing the lr to 0.01 and 0.001 at the 100th and 150th epochs, respectively.
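    # (Illustrative only, not part of the original project.) Inside the epoch loop, the
    # lr drops mentioned above could be applied roughly like this, assuming the MegEngine
    # optimizer exposes `param_groups` in the same way as PyTorch:
    #     if i in (100, 150):
    #         for group in opt.param_groups:
    #             group["lr"] *= 0.1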
    epochs = 100
    data_tensor = mge.tensor(dtype=np.float32)
    label_tensor = mge.tensor(dtype=np.int32)
    losses = []
    for i in range(epochs):
        print(".")
        loss_rec = []
        for data, label in train_dataloader:
            # Optional debugging: visualize one training image
            """
            img = np.array(data[0])
            img = np.transpose(img, [1, 2, 0])
            print(img.shape)
            cv2.imshow("img", img)
            cv2.waitKey(0)
            """
            data_tensor.set_value(data)
            label_tensor.set_value(label.astype("int32"))
            opt.zero_grad()
            # pred = smallnet(data)
            # print(pred.shape)
            # exit()
            _, loss = train_func(data_tensor, label_tensor, net=smallnet, optimizer=opt)
            opt.step()
            loss_rec.append(loss.numpy().item())
        loss = sum(loss_rec) / len(loss_rec)
        losses.append(loss)
        print("[Epoch {}] loss: {}".format(i, loss))
    "" loss visualization model preservation ""
    import matplotlib.pyplot as plt
    plt.plot(range(len(losses)), losses, color='red')
    plt.xlabel("iterator")
    plt.ylabel('loss')
    plt.show()
    # Save the model
    mge.save(smallnet.state_dict(), 'resnet18_static_100.mge')

Forgive this newbie; I haven't fully understood the training part of the model yet.

Model test

If you use the model provided by MegEngine, you don't need to write a separate test function; otherwise you have to write a test function as complex as the training one, which shows the power of MegEngine.

def model_test():
    """
    Model loading and testing
    :return:
    """
    smallnet = resnet18()
    import megengine as mge
    state_dict = mge.load('resnet18_static_100.mge')
    smallnet.load_state_dict(state_dict)
    # Create a DataLoader for testing
    from megengine.data import transform

    batch_size = 1
    sampler_test = SequentialSampler(dataset=test_dataset, batch_size=batch_size)

    transform_test = transform.Compose([
        transform.Resize(256),
        transform.CenterCrop(224),
        transform.Normalize(
            mean=[103.530, 116.280, 123.675], std=[57.375, 57.120, 58.395]),  # BGR
        transform.ToMode("CHW"),
    ])

    test_dataloader = DataLoader(
        test_dataset,
        sampler=sampler_test,
        transform=transform_test,
    )

    # Define a static graph test function to evaluate the model
    @trace(symbolic=True)
    def eval_func(data, label, *, net):
        net.eval()  # Set the network to evaluation mode
        pred = net(data)
        loss = F.cross_entropy_with_softmax(pred, label)
        return pred, loss

    data_tensor = mge.tensor()
    label_tensor = mge.tensor(dtype=np.int32)
    correct = 0
    total = 0
    for data, label in test_dataloader:
        label = label.astype("int32")
        pred, _ = eval_func(data, label, net=smallnet)
        pred_label = F.argmax(pred, axis=1)
        # if pred_label.numpy()[0] != label[0]:
        # img = np.array(data[0])
        # img = np.transpose(img, [1, 2, 0])
        # print(img.shape)
        # cv2.imshow("img", img)
        # cv2.waitKey(0)
        correct += (pred_label == label).sum().numpy().item()
        total += label.shape[0]

    print("correct: {}, total: {}, accuracy: {:.2f}%".format(correct, total, correct * 100.0 / total))

Things to note

  1. The model provided by MegStudio is trained for ImageNet's 1000-class classification; it can be adapted to 2-class classification while still loading the pre-trained weights (see the sketch after this list)
  2. Training on MegStudio is a bit slow, so I recommend copying the code to your local machine
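A rough sketch of point 1 (illustrative only; it assumes the resnet18 definition exposes its 512-feature classifier head as fc, as in standard ResNet-18 implementations, and that M.Linear is available in megengine.module):

import megengine as mge
import megengine.module as M

net = resnet18()                                  # original 1000-class ImageNet head
state_dict = mge.load("resnet18_naiveaug_70312_78a63ca6.pkl")
net.load_state_dict(state_dict)                   # load the pre-trained weights first
net.fc = M.Linear(512, 2)                         # then swap the head for a 2-class classifier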

Thank you

  • References: MegEngine Cat and Dog Wars, MegStudio
  • Thanks to my buddy Battlefield bag for the invitation; his homepage is at juejin.cn/user/442409…