This is the 29th day of my participation in the First Challenge 2022

ResNet: TensorFlow2.x version, ResNet50 Image Classification Task (large data set)

Abstract

This example uses a subset of the Plant Seedlings data set, which has 12 categories in total. In this post we implement an image classification task in TensorFlow 2.x, using ResNet50 as the classification model. The implementation has the following characteristics:

1. A custom image loading method that is more flexible and efficient: images are not loaded into memory all at once, which saves memory and suits large data sets.

2. Pretrained weights are loaded, so training takes less time.

3. albumentations is used for data augmentation.

Training

The first step is to import the required packages and set the global parameters

import numpy as np
from tensorflow.keras.optimizers import Adam
import cv2
from tensorflow.keras.preprocessing.image import img_to_array
from sklearn.model_selection import train_test_split
# Import from the public tensorflow.keras API rather than the private tensorflow.python.keras path
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.applications import ResNet50
import os
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
import albumentations

norm_size = 224
datapath = 'data/train'
EPOCHS = 20
INIT_LR = 3e-4
labelList = []
dicClass = {'Black-grass': 0, 'Charlock': 1, 'Cleavers': 2, 'Common Chickweed': 3, 'Common wheat': 4, 'Fat Hen': 5, 'Loose Silky-bent': 6, 'Maize': 7, 'Scentless Mayweed': 8, 'Shepherds Purse': 9, 'Small-flowered Cranesbill': 10, 'Sugar beet': 11}
classnum = 12
batch_size = 4
np.random.seed(42)

Keras is integrated into TensorFlow 2.0 and later, so there is no need to install Keras separately. Older Keras code can be upgraded to TensorFlow 2.x by prefixing the keras imports with tensorflow (for example, keras.layers becomes tensorflow.keras.layers).

With the imports covered, here are the important global parameters:

  • norm_size = 224 sets the size of the input image. The default input size of ResNet50 is 224 x 224.

  • datapath = 'data/train' sets the path where the images are stored. If there are many images, do not place them inside the project directory; otherwise PyCharm scans all the images when it loads the project, which is very slow.

  • EPOCHS = 20 sets the number of training epochs. It is hard to know in advance how many epochs are appropriate; in general, 300 is enough, and this example uses 20.

  • INIT_LR = 3e-4 is the initial learning rate. It is generally started around 1e-3 and reduced gradually, but it should not become too small; going down to 1e-6 is acceptable.

  • classnum = 12 is the number of categories. The data set has 12 categories, so the model classifies into 12 classes.

  • batch_size = 4 is the batch size. Set it according to the hardware and the size of the data set. If the value is too small, the loss fluctuates heavily. On Windows, GPU memory usage can be viewed in Task Manager.

    On Ubuntu, nvidia-smi shows GPU memory usage.

  • np.random.seed(42) fixes NumPy's random seed, so that the random indices are reproducible.

Step 2 Load the image

Instead of reading the images at this point, we only return a list of image paths; the labels are collected in labelList.

The code is as follows:

def loadImageData():
    imageList = []
    listClasses = os.listdir(datapath)  # Category folder
    print(listClasses)
    for class_name in listClasses:
        label_id = dicClass[class_name]
        class_path = os.path.join(datapath, class_name)
        image_names = os.listdir(class_path)
        for image_name in image_names:
            image_full_path = os.path.join(class_path, image_name)
            labelList.append(label_id)
            imageList.append(image_full_path)
    return imageList


print("Start loading data")
imageArr = loadImageData()
labelList = np.array(labelList)
print("Loading data completed")
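The hard-coded dicClass has to list every folder under datapath, otherwise the lookup raises a KeyError. As an alternative sketch (not the method used in this post), the mapping could be derived from the sorted folder names; for the 12 class names above, sorted order happens to reproduce the same ids. The helper name build_class_index is hypothetical, and the demo uses a temporary directory in place of data/train:

```python
import os
import tempfile


def build_class_index(root):
    """Map each sub-folder name under root to an integer id, in sorted order."""
    class_names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {name: idx for idx, name in enumerate(class_names)}


# Demo with a throwaway directory standing in for data/train
with tempfile.TemporaryDirectory() as root:
    for name in ["Maize", "Charlock", "Black-grass"]:
        os.makedirs(os.path.join(root, name))
    demo_index = build_class_index(root)

print(demo_index)
```

If this approach is used, the same mapping has to be saved or rebuilt identically at prediction time.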

Once the data is ready, we split it into a training set and a validation set, usually in a 4:1 or 7:3 ratio. The split is done with the train_test_split() method, which is imported with from sklearn.model_selection import train_test_split. For example:

trainX, valX, trainY, valY = train_test_split(imageArr, labelList, test_size=0.2, random_state=42)

Step 3 Image augmentation

train_transform = albumentations.Compose([
        albumentations.OneOf([
            albumentations.RandomGamma(gamma_limit=(60, 120), p=0.9),
            albumentations.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.9),
            albumentations.CLAHE(clip_limit=4.0, tile_grid_size=(4, 4), p=0.9),
        ]),
        albumentations.HorizontalFlip(p=0.5),
        albumentations.ShiftScaleRotate(shift_limit=0.2, scale_limit=0.2, rotate_limit=20,
                                        interpolation=cv2.INTER_LINEAR, border_mode=cv2.BORDER_CONSTANT, p=1),
        albumentations.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1.0)
    ])
val_transform = albumentations.Compose([
        albumentations.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1.0)])

The transforms chosen here are arbitrary. For the specific settings, refer to an article I wrote earlier:

Summary of using the image augmentation library Albumentations (AIhao, CSDN blog)

Two augmentation pipelines are defined, one for training and one for validation. The validation set only needs the images to be normalized.
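To make the Normalize step concrete: with max_pixel_value=255.0, each channel value is scaled to the range 0 to 1, the channel mean is subtracted, and the result is divided by the channel standard deviation. A minimal pure-Python sketch for a single pixel follows; the helper name normalize_pixel is made up for illustration and is not part of albumentations:

```python
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)
MAX_PIXEL_VALUE = 255.0


def normalize_pixel(pixel):
    """Apply (value / max_pixel_value - mean) / std to each channel of one pixel."""
    return tuple(
        (value / MAX_PIXEL_VALUE - mean) / std
        for value, mean, std in zip(pixel, MEAN, STD)
    )


white = normalize_pixel((255, 255, 255))
black = normalize_pixel((0, 0, 0))
print(white)
print(black)
```

The output values are no longer in the 0 to 255 range, which is why the same transform has to be applied again at prediction time.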

Step 4 Define the image processing method

The main purpose of the generator is to process images and iteratively return a batch of images with corresponding labels.

Ideas:

In the while loop:

  • Initialize input_samples and input_labels as lists that store the images and their corresponding labels.

  • Repeat batch_size times:

    • Pick a random index
    • Get the image path and label from file_pathList and labels
    • Read the image
    • If training, apply the training transform; otherwise, apply the validation transform
    • Resize the image
    • Convert the image to an array
    • Append the image and label to input_samples and input_labels, respectively
  • Convert the lists to numpy arrays.

  • Yield the batch

def generator(file_pathList, labels, batch_size, train_action=False):
    L = len(file_pathList)
    while True:
        input_labels = []
        input_samples = []
        for row in range(0, batch_size):
            # Pick a random sample (sampling with replacement)
            temp = np.random.randint(0, L)
            X = file_pathList[temp]
            Y = labels[temp]
            # Reading via np.fromfile + imdecode also works for paths with non-ASCII characters
            image = cv2.imdecode(np.fromfile(X, dtype=np.uint8), -1)
            # Drop the alpha channel if the image has one
            if image.shape[2] > 3:
                image = image[:, :, :3]
            if train_action:
                image = train_transform(image=image)['image']
            else:
                image = val_transform(image=image)['image']
            image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
            image = img_to_array(image)
            input_samples.append(image)
            input_labels.append(Y)
        batch_x = np.asarray(input_samples)
        batch_y = np.asarray(input_labels)
        yield (batch_x, batch_y)
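The generator above draws indices with np.random.randint, so within one epoch some images can appear twice and others not at all. A common alternative, shown here only as a sketch and not as the method used in this post, shuffles the indices once per pass and walks through them in batches. The names epoch_batches and load_fn are hypothetical; load_fn stands in for the read, transform and resize steps:

```python
import random


def epoch_batches(paths, labels, batch_size, load_fn, seed=42):
    """Yield (samples, labels) batches forever, visiting every item once per pass."""
    rng = random.Random(seed)
    indices = list(range(len(paths)))
    while True:
        rng.shuffle(indices)  # new order for each pass over the data
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            batch_x = [load_fn(paths[i]) for i in chunk]
            batch_y = [labels[i] for i in chunk]
            yield batch_x, batch_y


# Demo: file names stand in for images, and load_fn just upper-cases the name
demo_paths = ["a.png", "b.png", "c.png", "d.png", "e.png"]
demo_labels = [0, 1, 2, 3, 4]
gen = epoch_batches(demo_paths, demo_labels, batch_size=2, load_fn=str.upper)
first_pass = [next(gen) for _ in range(3)]  # 3 batches cover 5 items
seen = sorted(y for _, ys in first_pass for y in ys)
print(seen)
```

In the real pipeline, batch_x and batch_y would be converted with np.asarray before being yielded, as in the generator above. Note that the last batch of a pass can be smaller than batch_size.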

Step 5 Retain the best model and dynamically set the learning rate

ModelCheckpoint: Used to save the model with the best performance.

The syntax is as follows:

keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1)

This callback saves the model to filepath after each epoch.

filepath can be a format string whose placeholders are filled with the epoch value and the keys of the logs passed to on_epoch_end.

For example, if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5, multiple files are generated, each named with the epoch number and the validation loss.

Parameters

  • filepath: the path where the model is saved
  • monitor: the value to monitor
  • verbose: the information display mode, 0 or 1
  • save_best_only: when set to True, only the model that performs best on the validation set is saved
  • mode: one of 'auto', 'min', 'max'. When save_best_only=True, it determines the criterion for the best model. For example, when monitoring val_acc the mode should be max, and when monitoring val_loss the mode should be min. In auto mode, the criterion is inferred automatically from the name of the monitored value.
  • save_weights_only: if set to True, only the model weights are saved; otherwise the whole model (including structure, configuration, etc.) is saved
  • period: the number of epochs between checkpoints (recent tf.keras releases deprecate it in favor of save_freq)
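The filepath placeholders follow Python's str.format syntax, so the resulting file name can be previewed without running any training. A small sketch; the epoch and val_loss values are invented for the example:

```python
# Template in the style accepted by ModelCheckpoint's filepath argument
template = "weights.{epoch:02d}-{val_loss:.2f}.hdf5"

# Example values; during training they come from the epoch counter and the logs dict
name = template.format(epoch=3, val_loss=0.4567)
print(name)  # weights.03-0.46.hdf5
```

A placeholder such as {val_loss} can only be filled if that key exists in the logs, which requires validation data to be passed to fit.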

ReduceLROnPlateau: reduces the learning rate when the monitored metric stops improving. The syntax is as follows:

keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto', epsilon=0.0001, cooldown=0, min_lr=0)

When learning stagnates, reducing the learning rate by a factor of 2 to 10 often helps. This callback monitors a metric, and if it shows no improvement for patience epochs, the learning rate is reduced.

Parameters

  • monitor: the quantity to monitor
  • factor: the factor by which the learning rate is reduced each time, as lr = lr * factor
  • patience: the number of epochs without improvement after which the learning rate reduction is triggered
  • mode: one of 'auto', 'min', 'max'. In min mode, the learning rate is reduced when the monitored value stops decreasing. In max mode, it is reduced when the monitored value stops increasing.
  • epsilon: the threshold used to decide whether the monitored value has reached a plateau (this argument is named min_delta in tf.keras)
  • cooldown: the number of epochs to wait after a learning rate reduction before normal operation resumes
  • min_lr: the lower bound of the learning rate
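The interaction of factor, patience and min_lr can be traced with a simplified simulation. This is only a sketch of the idea under stated assumptions: it ignores cooldown, it is not the Keras implementation, and the function name simulate_reduce_lr and the accuracy values are invented:

```python
def simulate_reduce_lr(values, init_lr, factor=0.5, patience=2,
                       min_lr=1e-6, mode="max", min_delta=1e-4):
    """Return the learning rate in effect after each epoch's metric value."""
    best = float("-inf") if mode == "max" else float("inf")
    wait = 0
    lr = init_lr
    schedule = []
    for value in values:
        if mode == "max":
            improved = value > best + min_delta
        else:
            improved = value < best - min_delta
        if improved:
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)  # never go below min_lr
                wait = 0
        schedule.append(lr)
    return schedule


# Invented validation accuracies that plateau twice
val_acc = [0.5, 0.6, 0.6, 0.6, 0.7, 0.7, 0.7]
schedule = simulate_reduce_lr(val_acc, init_lr=3e-4, factor=0.5, patience=2)
print(schedule)
```

With patience=2, the rate is halved after the second consecutive epoch without improvement, so this trace halves it twice.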

The code for this example is as follows:

checkpointer = ModelCheckpoint(filepath='best_model.hdf5',
                               monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')

reduce = ReduceLROnPlateau(monitor='val_accuracy', patience=10,
                           verbose=1,
                           factor=0.5,
                           min_lr=1e-6)

Step 6 Model and training

model = Sequential()
model.add(ResNet50(include_top=False, pooling='avg', weights='imagenet'))
model.add(Dense(classnum, activation='softmax'))
optimizer = Adam(learning_rate=INIT_LR)
model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# Step counts are integers, so use floor division instead of true division
history = model.fit(generator(trainX, trainY, batch_size, train_action=True),
                    steps_per_epoch=len(trainX) // batch_size,
                    validation_data=generator(valX, valY, batch_size, train_action=False),
                    epochs=EPOCHS,
                    validation_steps=len(valX) // batch_size,
                    callbacks=[checkpointer, reduce])
model.save('my_model.h5')
print(history)
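steps_per_epoch and validation_steps count batches, so whole numbers are the natural choice. A small sketch of the arithmetic, with made-up data set sizes: floor division drops the final partial batch, while math.ceil counts it.

```python
import math

num_train = 3801   # hypothetical number of training images
num_val = 950      # hypothetical number of validation images
batch_size = 4

# Floor division ignores the last partial batch
steps_floor = num_train // batch_size
val_steps_floor = num_val // batch_size

# Ceiling includes the last partial batch
steps_ceil = math.ceil(num_train / batch_size)
val_steps_ceil = math.ceil(num_val / batch_size)

print(steps_floor, steps_ceil)          # 950 951
print(val_steps_floor, val_steps_ceil)  # 237 238
```

Because the generator in this post samples indices at random and never runs out, either choice works; the step count only decides how many batches make up one epoch.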

The previous post did not use a pretrained model, and its approach contained a mistake. After checking the documentation, I found that the following usage is wrong:

# model = ResNet50(weights="imagenet", input_shape=(224, 224, 3), include_top=False, classes=classnum)  # include_top=False removes the final fully connected layer

The classes argument can only be specified under two conditions: include_top=True and weights=None. Otherwise classes cannot be set.

So a custom number of classes cannot be combined directly with the pretrained weights. Instead, use this approach:

model = Sequential()
model.add(ResNet50(include_top=False, pooling='avg', weights='imagenet'))
model.add(Dense(classnum, activation='softmax'))

In addition, the previous article used fit_generator. In newer versions, fit accepts generators directly, so the call is changed to fit.

Step 7 Save the training results and plot the curves

loss_trend_graph_path = r"WW_loss.jpg"
acc_trend_graph_path = r"WW_acc.jpg"
import matplotlib.pyplot as plt

print("Now,we start drawing the loss and acc trends graph...")
# summarize history for accuracy
fig = plt.figure(1)
plt.plot(history.history["accuracy"])
plt.plot(history.history["val_accuracy"])
plt.title("Model accuracy")
plt.ylabel("accuracy")
plt.xlabel("epoch")
plt.legend(["train", "test"], loc="upper left")
plt.savefig(acc_trend_graph_path)
plt.close(1)
# summarize history for loss
fig = plt.figure(2)
plt.plot(history.history["loss"])
plt.plot(history.history["val_loss"])
plt.title("Model loss")
plt.ylabel("loss")
plt.xlabel("epoch")
plt.legend(["train", "test"], loc="upper left")
plt.savefig(loss_trend_graph_path)
plt.close(2)
print("We are done, everything seems OK...")
# On Windows, uncomment the next line to shut the machine down 10 seconds after training finishes
#os.system("shutdown -s -t 10")

Testing

Single-image prediction

1. Import dependencies

import cv2
import numpy as np
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
import time
import os
import albumentations

2. Set global parameters

Note that the label dictionary must use the same class order as in training.

norm_size = 224
imagelist = []
emotion_labels = {
    0: 'Black-grass', 1: 'Charlock', 2: 'Cleavers', 3: 'Common Chickweed',
    4: 'Common wheat', 5: 'Fat Hen', 6: 'Loose Silky-bent', 7: 'Maize',
    8: 'Scentless Mayweed', 9: 'Shepherds Purse', 10: 'Small-flowered Cranesbill',
    11: 'Sugar beet',
}
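One way to keep the two dictionaries in sync is to derive the index-to-name mapping from the training dictionary instead of typing it twice. A minimal sketch, assuming dicClass is copied from the training script (shortened here to three entries); the name index_to_name is made up for the example:

```python
# Shortened copy of the training dictionary, for illustration only
dicClass = {'Black-grass': 0, 'Charlock': 1, 'Cleavers': 2}

# Swap keys and values to get index -> class name
index_to_name = {idx: name for name, idx in dicClass.items()}
print(index_to_name[1])  # Charlock
```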

3. Set image normalization parameters

The normalization parameters must match the ones used for validation during training.

val_transform = albumentations.Compose([albumentations.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1.0)])

4. Load the model

emotion_classifier=load_model("my_model.h5")

5. Process the image

The logic for processing the image is similar to that used for the training set.

  • Read the image.
  • Normalize the image.
  • Resize the image to norm_size×norm_size.
  • Convert the image to an array.
  • Append it to imagelist.
  • Convert the list to a numpy array.

image = cv2.imdecode(np.fromfile('data/test/0a64e3e6c.png', dtype=np.uint8), -1)
# Drop the alpha channel if present, as the training generator does
if image.shape[2] > 3:
    image = image[:, :, :3]
image = val_transform(image=image)['image']
image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
image = img_to_array(image)
imagelist.append(image)
imageList = np.array(imagelist, dtype="float")


6. Predict the category

Run the prediction and take the index of the class with the highest probability.

t1 = time.time()  # start time; t1 was used below without being defined
pre = np.argmax(emotion_classifier.predict(imageList))
emotion = emotion_labels[pre]
t2 = time.time()
print(emotion)
t3 = t2 - t1
print(t3)
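np.argmax over the whole prediction array works here because the batch holds a single image. To also see the runner-up classes, the probability row can be ranked. A pure-Python sketch with made-up probabilities and a shortened label table; top_k is a hypothetical helper:

```python
def top_k(probabilities, labels, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    ranked = sorted(enumerate(probabilities), key=lambda item: item[1], reverse=True)
    return [(labels[idx], prob) for idx, prob in ranked[:k]]


demo_labels = {0: 'Black-grass', 1: 'Charlock', 2: 'Cleavers', 3: 'Maize'}
demo_probs = [0.05, 0.70, 0.05, 0.20]  # invented softmax output for one image
result = top_k(demo_probs, demo_labels, k=2)
print(result)  # [('Charlock', 0.7), ('Maize', 0.2)]
```

With the real model, the probabilities would be one row of the array returned by predict, and labels would be emotion_labels.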

Batch prediction

The difference between batch prediction and single-image prediction lies mainly in how the data is read and how the predicted categories are processed afterwards. Nothing else changes.

Steps:

  • Load the model.
  • Define the test set directory.
  • List the images in the directory.
  • Loop over the images:
    • Read the image
    • Normalize the image
    • Resize the image
    • Convert it to an array
    • Append it to imagelist
  • Predict

t1 = time.time()  # start time; t1 was used below without being defined
predict_dir = 'data/test'
test11 = os.listdir(predict_dir)
for file in test11:
    filepath = os.path.join(predict_dir, file)

    image = cv2.imdecode(np.fromfile(filepath, dtype=np.uint8), -1)
    # Drop the alpha channel if present, as the training generator does
    if image.shape[2] > 3:
        image = image[:, :, :3]
    image = val_transform(image=image)['image']
    image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
    image = img_to_array(image)
    imagelist.append(image)
imageList = np.array(imagelist, dtype="float")
out = emotion_classifier.predict(imageList)
print(out)
# Index of the most probable class for each image
pre = [np.argmax(i) for i in out]

class_name_list = [emotion_labels[i] for i in pre]
print(class_name_list)
t2 = time.time()
t3 = t2 - t1
print(t3)
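The loop above collects every test image in imagelist before predicting, which can exhaust memory on a large test set. A sketch of splitting the file names into fixed-size chunks instead; chunked is a hypothetical helper and the file names are invented. Each chunk would be loaded, passed to predict, and discarded before the next one is read:

```python
def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


demo_files = ["img_%d.png" % i for i in range(10)]  # invented file names
chunks = list(chunked(demo_files, 4))
print([len(chunk) for chunk in chunks])  # [4, 4, 2]
```

This keeps peak memory proportional to the chunk size rather than to the size of the whole test directory.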

The complete code: download.csdn.net/download/hh…