This is the 29th day of my participation in the First Challenge 2022
ResNet: TensorFlow2.x version, ResNet50 Image Classification Task (large data set)
Abstract
This example uses a subset of the Plant Seedlings dataset, which has 12 categories in total. In this post we implement an image classification task in TensorFlow 2.x, using ResNet50 as the classification model. The implementation has the following characteristics:
1. A custom image loading method that is more flexible and efficient: images are not loaded into memory all at once, which saves memory and suits large datasets.
2. Pretrained weights are loaded into the model, so training takes less time.
3. albumentations is used for data augmentation.
Training
Step 1 Import the required packages and set global parameters
import numpy as np
from tensorflow.keras.optimizers import Adam
import cv2
from tensorflow.keras.preprocessing.image import img_to_array
from sklearn.model_selection import train_test_split
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.applications import ResNet50
import os
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
import albumentations

norm_size = 224          # input image size
datapath = 'data/train'  # root folder with one subfolder per class
EPOCHS = 20
INIT_LR = 3e-4           # initial learning rate
labelList = []
# Class name -> integer label
dicClass = {'Black-grass': 0, 'Charlock': 1, 'Cleavers': 2, 'Common Chickweed': 3,
            'Common wheat': 4, 'Fat Hen': 5, 'Loose Silky-bent': 6, 'Maize': 7,
            'Scentless Mayweed': 8, 'Shepherds Purse': 9,
            'Small-flowered Cranesbill': 10, 'Sugar beet': 11}
classnum = 12
batch_size = 4
np.random.seed(42)
Keras is integrated into TensorFlow 2.0 and later, so there is no need to install Keras separately. Older Keras code can usually be ported by importing from tensorflow.keras instead of keras.
Apart from the imports, there are a few important global parameters:
- norm_size = 224: the size of the input image. ResNet50's default input size is 224 x 224.
- datapath = 'data/train': the path where the images are stored. If there are many images, do not place them in the project directory; otherwise PyCharm scans all of them when loading the project, which is very slow.
- EPOCHS = 20: the number of training epochs. There is no universal answer for how many epochs are appropriate; in general, 300 is enough.
- INIT_LR = 3e-4: the initial learning rate. A common approach is to start around 1e-3 and decrease it gradually; it should not become too small, and values down to about 1e-6 are usable.
- classnum = 12: the number of categories. This dataset has 12 categories, so the classifier has 12 outputs.
- batch_size = 4: the batch size. Set it according to the hardware and the dataset size. If it is too small, the loss fluctuates heavily; if it is too large, GPU memory may run out. On Windows, GPU memory usage can be checked in Task Manager; on Ubuntu, use nvidia-smi.
- np.random.seed(42): fixes NumPy's random seed so that the random indices are reproducible.
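To see what fixing the seed does, here is a small sketch (the helper name draw_indices is made up for illustration): with the same seed, np.random.randint returns the same sequence of indices on every run.

```python
import numpy as np

def draw_indices(seed, count, upper):
    """Return `count` random indices in [0, upper) after seeding NumPy."""
    np.random.seed(seed)
    return [int(np.random.randint(0, upper)) for _ in range(count)]

first = draw_indices(42, 5, 100)
second = draw_indices(42, 5, 100)
print(first == second)  # True: same seed, same indices
```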
Step 2 Load the image
At this stage the images are not read or processed; we only collect a list of image paths and the matching labels.
The code is as follows:
def loadImageData():
    imageList = []
    listClasses = os.listdir(datapath)  # one folder per category
    print(listClasses)
    for class_name in listClasses:
        label_id = dicClass[class_name]
        class_path = os.path.join(datapath, class_name)
        image_names = os.listdir(class_path)
        for image_name in image_names:
            image_full_path = os.path.join(class_path, image_name)
            # Labels go into the global labelList, paths into imageList
            labelList.append(label_id)
            imageList.append(image_full_path)
    return imageList

print("Start loading data")
imageArr = loadImageData()
labelList = np.array(labelList)
print("Loading data completed")
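The same idea can be sketched without the global labelList. This is only an illustration under assumptions: the function name list_image_paths, the tiny class_to_id mapping and the temporary folder layout are hypothetical, and entries that are not known class folders are skipped.

```python
import os
import tempfile

def list_image_paths(root, class_to_id):
    """Collect (paths, labels) from root/<class_name>/<image> without reading any image."""
    paths, labels = [], []
    for class_name in sorted(os.listdir(root)):
        class_path = os.path.join(root, class_name)
        # Skip stray files and folders that are not in the mapping
        if class_name not in class_to_id or not os.path.isdir(class_path):
            continue
        for image_name in sorted(os.listdir(class_path)):
            paths.append(os.path.join(class_path, image_name))
            labels.append(class_to_id[class_name])
    return paths, labels

# Build a throwaway folder tree to demonstrate the function
with tempfile.TemporaryDirectory() as root:
    for class_name, files in {"Maize": ["a.png", "b.png"], "Charlock": ["c.png"]}.items():
        os.makedirs(os.path.join(root, class_name))
        for name in files:
            open(os.path.join(root, class_name, name), "w").close()
    open(os.path.join(root, "notes.txt"), "w").close()  # ignored: not a class folder
    demo_paths, demo_labels = list_image_paths(root, {"Charlock": 1, "Maize": 7})

print(demo_labels)  # [1, 7, 7]
```

Returning the labels together with the paths keeps the two lists aligned by construction.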
Once the data is ready, split it into a training set and a validation set, usually in a 4:1 or 7:3 ratio. The split uses the train_test_split() function from sklearn.model_selection. For example:
trainX, valX, trainY, valY = train_test_split(imageArr, labelList, test_size=0.2, random_state=42)
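To make test_size=0.2 concrete, the following plain-Python stand-in shuffles paired paths and labels with a fixed seed and keeps the last 20% for validation. It is not scikit-learn's implementation, and the name simple_split and the sample data are made up.

```python
import random

def simple_split(items, labels, val_fraction=0.2, seed=42):
    """Shuffle (item, label) pairs together and split off a validation part."""
    pairs = list(zip(items, labels))
    random.Random(seed).shuffle(pairs)
    n_val = int(round(len(pairs) * val_fraction))
    train, val = pairs[:len(pairs) - n_val], pairs[len(pairs) - n_val:]
    return train, val

paths = ["img_%d.png" % i for i in range(10)]
labels = [i % 2 for i in range(10)]
train_pairs, val_pairs = simple_split(paths, labels)
print(len(train_pairs), len(val_pairs))  # 8 2
```

In practice, train_test_split also accepts a stratify argument, which keeps the class proportions the same in both parts.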
Step 3 Image augmentation
train_transform = albumentations.Compose([
    # Randomly pick one of these color adjustments
    albumentations.OneOf([
        albumentations.RandomGamma(gamma_limit=(60, 120), p=0.9),
        albumentations.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.9),
        albumentations.CLAHE(clip_limit=4.0, tile_grid_size=(4, 4), p=0.9),
    ]),
    albumentations.HorizontalFlip(p=0.5),
    albumentations.ShiftScaleRotate(shift_limit=0.2, scale_limit=0.2, rotate_limit=20,
                                    interpolation=cv2.INTER_LINEAR, border_mode=cv2.BORDER_CONSTANT, p=1),
    albumentations.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1.0)
])
# Validation only normalizes the image
val_transform = albumentations.Compose([
    albumentations.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1.0)
])
These settings are just one possible choice; for details on the individual options, see my earlier article:
Summary of using the image augmentation library Albumentations (AIhao, CSDN blog)
Two augmentation pipelines are defined, one for training and one for validation. The validation set only needs the images to be normalized.
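As a rough illustration of what the Normalize step computes, the NumPy sketch below applies (pixel / max_pixel_value - mean) / std per channel. It reproduces the arithmetic only and is not the albumentations code; the function name normalize is hypothetical.

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(image, max_pixel_value=255.0):
    """Scale an HxWx3 uint8 image to [0, 1], then standardize each channel."""
    scaled = image.astype(np.float32) / max_pixel_value
    return (scaled - MEAN) / STD

white = np.full((2, 2, 3), 255, dtype=np.uint8)
result = normalize(white)
print(result.shape)  # (2, 2, 3)
```

A fully white pixel maps to (1 - mean) / std in each channel, and a black pixel to -mean / std.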
Step 4 Define the image processing method
The generator's main job is to process images and to yield, on each iteration, a batch of images with their labels.
The idea, inside the while loop:
- Initialize input_samples and input_labels as lists that store the images and their labels, respectively.
- Loop batch_size times:
  - Pick a random index.
  - Get the image path and label from file_pathList and labels.
  - Read the image.
  - Apply the training transform when training; otherwise apply the validation transform.
  - Resize the image.
  - Convert the image to an array.
  - Append the image and label to input_samples and input_labels, respectively.
- Convert the lists to numpy arrays.
- Yield the batch.
def generator(file_pathList, labels, batch_size, train_action=False):
    L = len(file_pathList)
    while True:
        input_labels = []
        input_samples = []
        for row in range(0, batch_size):
            # Pick a random sample (with replacement)
            temp = np.random.randint(0, L)
            X = file_pathList[temp]
            Y = labels[temp]
            # Read the file as bytes and decode it; flag -1 keeps the image unchanged
            image = cv2.imdecode(np.fromfile(X, dtype=np.uint8), -1)
            # Drop the alpha channel if there is one
            if image.shape[2] > 3:
                image = image[:, :, :3]
            if train_action:
                image = train_transform(image=image)['image']
            else:
                image = val_transform(image=image)['image']
            image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
            image = img_to_array(image)
            input_samples.append(image)
            input_labels.append(Y)
        batch_x = np.asarray(input_samples)
        batch_y = np.asarray(input_labels)
        yield (batch_x, batch_y)
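Because the generator above samples indices with replacement, some images may be drawn several times in an epoch and others not at all. One possible variant, shown here as a dependency-free sketch (the name shuffled_batches is hypothetical and strings stand in for images), reshuffles the indices once per pass so that every sample is visited once before any repeats.

```python
import random

def shuffled_batches(samples, labels, batch_size, seed=42):
    """Yield (batch_samples, batch_labels) forever, visiting each sample once per pass."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    while True:
        rng.shuffle(indices)  # new order for every pass over the data
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            yield [samples[i] for i in chunk], [labels[i] for i in chunk]

gen = shuffled_batches(["a", "b", "c", "d", "e"], [0, 1, 2, 3, 4], batch_size=2)
first_pass = [next(gen) for _ in range(3)]  # 3 batches cover 5 samples
seen = sorted(x for batch, _ in first_pass for x in batch)
print(seen)  # ['a', 'b', 'c', 'd', 'e']
```

The trade-off is that the last batch of each pass can be smaller than batch_size.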
Step 5 Keep the best model and adjust the learning rate dynamically
ModelCheckpoint: saves the best-performing model.
The syntax is as follows:
keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1)
This callback saves the model to filepath after each epoch.
filepath can be a format string whose placeholders are filled with the epoch value and the keys of the logs passed to on_epoch_end.
For example, if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5, multiple files are generated, each named after its epoch number and validation loss.
Parameters
- filepath: the path where the model is saved.
- monitor: the value to be monitored.
- verbose: the information display mode, 0 or 1.
- save_best_only: when set to True, only the model that performs best on the validation set is saved.
- mode: one of 'auto', 'min', 'max'. When save_best_only=True, it determines how the best model is chosen. For example, when the monitored value is val_acc, the mode should be max; when it is val_loss, the mode should be min. In auto mode, the criterion is inferred from the name of the monitored value.
- save_weights_only: if set to True, only the model weights are saved; otherwise the whole model (including structure, configuration, etc.) is saved.
- period: the number of epochs between checkpoints.
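The placeholders follow Python's str.format syntax, so the resulting file name can be previewed with plain Python. The epoch number and metric values below are made-up inputs, not training results.

```python
template = "weights.{epoch:02d}-{val_loss:.2f}.hdf5"

# Keys mirror what the callback fills in: the epoch number and the logged metrics
logs = {"val_loss": 0.4567, "val_accuracy": 0.81}
filename = template.format(epoch=3, **logs)
print(filename)  # weights.03-0.46.hdf5
```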
ReduceLROnPlateau: reduces the learning rate when the monitored metric stops improving. The syntax is as follows:
keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto', epsilon=0.0001, cooldown=0, min_lr=0)
When learning stagnates, reducing the learning rate by a factor of 2 to 10 often helps. This callback monitors a metric and reduces the learning rate if no improvement is seen for patience epochs.
Parameters
- monitor: the quantity to be monitored.
- factor: the factor by which the learning rate is reduced each time, in the form lr = lr * factor.
- patience: the number of epochs without improvement after which the learning rate reduction is triggered.
- mode: one of 'auto', 'min', 'max'. In min mode, the learning rate is reduced when the monitored value stops decreasing; in max mode, when it stops increasing.
- epsilon: the threshold used to decide whether the monitored value has reached a plateau (newer tf.keras versions name this argument min_delta).
- cooldown: the number of epochs to wait after a reduction before normal operation resumes.
- min_lr: the lower limit of the learning rate.
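The interaction of patience, factor and min_lr can be illustrated with a simplified simulation. This is a sketch of the idea in 'max' mode, not the Keras implementation (cooldown and the improvement threshold are omitted), and the accuracy values are invented for the demonstration.

```python
def simulate_plateau(values, lr, factor=0.5, patience=2, min_lr=1e-6):
    """Return the learning rate in effect after each epoch ('max' mode)."""
    best = float("-inf")
    wait = 0
    history = []
    for value in values:
        if value > best:          # metric improved: remember it, reset the counter
            best = value
            wait = 0
        else:                     # no improvement this epoch
            wait += 1
            if wait >= patience:  # plateau detected: shrink lr, but not below min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history

val_accuracy = [0.5, 0.6, 0.6, 0.6, 0.7, 0.7, 0.7]
lrs = simulate_plateau(val_accuracy, lr=3e-4)
print(lrs)
```

With patience=2, the rate is halved after the second epoch in a row without improvement, once at the fourth epoch and again at the seventh.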
The code for this example is as follows:
checkpointer = ModelCheckpoint(filepath='best_model.hdf5',
                               monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
reduce = ReduceLROnPlateau(monitor='val_accuracy', patience=10,
                           verbose=1,
                           factor=0.5,
                           min_lr=1e-6)
Step 6 Model and training
model = Sequential()
# ResNet50 backbone with ImageNet weights; global average pooling instead of the original top
model.add(ResNet50(include_top=False, pooling='avg', weights='imagenet'))
model.add(Dense(classnum, activation='softmax'))
optimizer = Adam(learning_rate=INIT_LR)
model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# The step counts must be integers, hence the floor division
history = model.fit(generator(trainX, trainY, batch_size, train_action=True),
                    steps_per_epoch=len(trainX) // batch_size,
                    validation_data=generator(valX, valY, batch_size, train_action=False),
                    epochs=EPOCHS,
                    validation_steps=len(valX) // batch_size,
                    callbacks=[checkpointer, reduce])
model.save('my_model.h5')
print(history)
The previous post did not use a pretrained model, and my first attempt at adding one contained a mistake. After checking the documentation, I found that the following approach is wrong:
# model = ResNet50(weights="imagenet", input_shape=(224, 224, 3), include_top=False, classes=classnum)  # include_top=False removes the final fully connected layer
classes can only be specified under two conditions: include_top=True and weights=None. Otherwise it cannot be set.
So a custom number of classes cannot be combined with the pretrained weights in this way. The alternative is:
model = Sequential()
model.add(ResNet50(include_top=False, pooling='avg', weights='imagenet'))
model.add(Dense(classnum, activation='softmax'))
In addition, the previous article used fit_generator. In newer versions, fit supports generators directly, so the code now uses fit.
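Since the generator never ends, fit relies on steps_per_epoch and validation_steps to know when an epoch is over, and both should be whole numbers. A small sketch of the arithmetic follows; the sample counts are placeholders, not the real dataset sizes, and steps_for is a hypothetical helper.

```python
import math

def steps_for(num_samples, batch_size):
    """Number of batches needed to cover num_samples once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

print(steps_for(1000, 4))  # 250
print(steps_for(1001, 4))  # 251
print(1001 // 4)           # 250: floor division drops the partial batch
```

Either rounding works with a random-sampling generator; what matters is that the value passed to fit is an integer.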
Step 7 Save the training results and plot the curves
loss_trend_graph_path = r"WW_loss.jpg"
acc_trend_graph_path = r"WW_acc.jpg"
import matplotlib.pyplot as plt
print("Now,we start drawing the loss and acc trends graph...")
# summarize history for accuracy
fig = plt.figure(1)
plt.plot(history.history["accuracy"])
plt.plot(history.history["val_accuracy"])
plt.title("Model accuracy")
plt.ylabel("accuracy")
plt.xlabel("epoch")
plt.legend(["train", "test"], loc="upper left")
plt.savefig(acc_trend_graph_path)
plt.close(1)
# summarize history for loss
fig = plt.figure(2)
plt.plot(history.history["loss"])
plt.plot(history.history["val_loss"])
plt.title("Model loss")
plt.ylabel("loss")
plt.xlabel("epoch")
plt.legend(["train", "test"], loc="upper left")
plt.savefig(loss_trend_graph_path)
plt.close(2)
print("We are done, everything seems OK...")
# On Windows: shut the machine down 10 seconds after training finishes
# os.system("shutdown -s -t 10")
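Besides plotting, the same history dictionary can be queried directly, for example to find the epoch with the highest validation accuracy. The numbers below are placeholders for illustration, not results from this model, and best_epoch is a hypothetical helper.

```python
def best_epoch(history_dict, key="val_accuracy"):
    """Return (1-based epoch, value) of the maximum of history_dict[key]."""
    values = history_dict[key]
    index = max(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]

# Same layout as history.history: metric name -> list with one value per epoch
fake_history = {"accuracy": [0.40, 0.62, 0.75], "val_accuracy": [0.38, 0.66, 0.61]}
print(best_epoch(fake_history))  # (2, 0.66)
```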
Testing
Single-image prediction
1. Import dependencies
import cv2
import numpy as np
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
import time
import os
import albumentations
2. Set global parameters
Note that the indices in this dictionary must match the class indices used in training.
norm_size = 224
imagelist = []
# Integer label -> class name (inverse of dicClass from training)
emotion_labels = {
    0: 'Black-grass', 1: 'Charlock', 2: 'Cleavers', 3: 'Common Chickweed',
    4: 'Common wheat', 5: 'Fat Hen', 6: 'Loose Silky-bent', 7: 'Maize',
    8: 'Scentless Mayweed', 9: 'Shepherds Purse',
    10: 'Small-flowered Cranesbill', 11: 'Sugar beet',
}
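One way to keep the two dictionaries consistent is to derive the index-to-name mapping from the training dictionary instead of typing it twice. A minimal sketch, using only three of the twelve classes for brevity (index_to_name is a name chosen for this example):

```python
# Subset of the training-time mapping (name -> index), shortened for the example
dicClass = {'Black-grass': 0, 'Charlock': 1, 'Cleavers': 2}

# Invert it to get index -> name for decoding predictions
index_to_name = {index: name for name, index in dicClass.items()}
print(index_to_name[1])  # Charlock
```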
3. Set image normalization parameters
The normalization settings must be the same as those used for validation.
val_transform = albumentations.Compose([
    albumentations.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1.0)
])
4. Load the model
emotion_classifier = load_model("my_model.h5")
5. Process the image
The image processing logic is similar to that used for the training set:
- Read the image.
- Normalize the image with val_transform.
- Resize the image to norm_size×norm_size.
- Convert the image to an array.
- Append it to imagelist.
- Convert the list to a numpy array.
image = cv2.imdecode(np.fromfile('data/test/0a64e3e6c.png', dtype=np.uint8), -1)
image = val_transform(image=image)['image']
image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
image = img_to_array(image)
imagelist.append(image)
imageList = np.array(imagelist, dtype="float")
6. Predict the category
Predict the class probabilities and take the index of the highest one.
t1 = time.time()  # start timing the prediction
pre = np.argmax(emotion_classifier.predict(imageList))
emotion = emotion_labels[pre]
t2 = time.time()
print(emotion)
t3 = t2 - t1
print(t3)
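predict returns one row of class probabilities per input image, so for a single image the output has shape (1, 12). The sketch below uses invented probability rows with three classes to show how np.argmax picks the class index; passing axis=1 also works unchanged for several images.

```python
import numpy as np

labels = {0: 'Black-grass', 1: 'Charlock', 2: 'Cleavers'}  # shortened label map

# Stand-in for the model's predict output: one row per image, one column per class
probs = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.3, 0.1]])

indices = np.argmax(probs, axis=1)           # best class per row
names = [labels[int(i)] for i in indices]
print(names)  # ['Charlock', 'Black-grass']
```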
Batch prediction
Batch prediction differs from single-image prediction mainly in how the data is read and how the predicted classes are handled afterwards. Nothing else changes.
Steps:
- Load the model.
- Define the test set directory.
- List the images in the directory.
- Loop over the images:
  - Read the image.
  - Normalize the image.
  - Resize the image.
  - Convert it to an array.
  - Append it to imagelist.
- Predict.
predict_dir = 'data/test'
test11 = os.listdir(predict_dir)
t1 = time.time()  # start timing
for file in test11:
    filepath = os.path.join(predict_dir, file)
    image = cv2.imdecode(np.fromfile(filepath, dtype=np.uint8), -1)
    image = val_transform(image=image)['image']
    image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
    image = img_to_array(image)
    imagelist.append(image)
imageList = np.array(imagelist, dtype="float")
out = emotion_classifier.predict(imageList)
print(out)
# Index of the highest probability for each image, then map to class names
pre = [np.argmax(i) for i in out]
class_name_list = [emotion_labels[i] for i in pre]
print(class_name_list)
t2 = time.time()
t3 = t2 - t1
print(t3)
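The loop above loads every test image into memory before calling predict. For a large test folder, one option is to process the file list in fixed-size chunks and predict chunk by chunk. A sketch of just the chunking part (the helper name chunked and the file names are hypothetical):

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

files = ["img_%d.png" % i for i in range(7)]
for batch_files in chunked(files, 3):
    # Here each chunk would be read, preprocessed and passed to predict()
    print(len(batch_files))  # prints 3, then 3, then 1
```

This keeps peak memory proportional to the chunk size rather than to the size of the test folder.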
The complete code: download.csdn.net/download/hh…