Abstract
This example uses a subset of the Plant Seedlings dataset, which has 12 categories, to build an image classification task with TensorFlow 2.x. The classification model is MobileNet, whose core is the depthwise separable convolution; it reduces both the computational cost and the size of the model. The model trained in this article is only 38M, which makes it suitable for real mobile applications.
For an introduction to MobileNet, see my previous article: wanghao.blog.csdn.net/article/det…
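To make the saving from depthwise separable convolution concrete, the following sketch compares weight counts for one layer. The layer sizes (3×3 kernel, 256 input and output channels) and both helper functions are assumptions chosen for illustration, not values taken from MobileNet itself.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 256 input and 256 output channels.
std = standard_conv_params(3, 256, 256)
sep = separable_conv_params(3, 256, 256)
ratio = sep / std  # equals 1/c_out + 1/k**2
print(std, sep, round(ratio, 3))
```

For this layer the separable version needs roughly 11.5% of the weights of the standard convolution.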
From this article you can learn:
1. How to load picture data and process data.
2. How to one-hot encode the labels.
3. How to use data augmentation.
4. How to use Mixup.
5. How to slice data sets.
6. How to load the pre-training model.
Training
1. Mixup
Mixup is an unconventional, data-independent augmentation method: it constructs new training samples and labels by linear interpolation between pairs of existing ones. The samples and labels are mixed as shown in the following formula, which is simple but unusual for an augmentation strategy.
$$\tilde{x} = \lambda x_i + (1 - \lambda) x_j, \qquad \tilde{y} = \lambda y_i + (1 - \lambda) y_j$$
Here $(x_i, y_i)$ and $(x_j, y_j)$ are two training pairs (a training sample and its label) from the original dataset, and $\lambda$ follows a Beta distribution, $\lambda \sim \mathrm{Beta}(\alpha, \alpha)$ with $\alpha \in (0, +\infty)$. The probability density function of the Beta distribution is shown in the figure below.
Therefore, $\alpha$ is a hyperparameter. As $\alpha$ increases, the training error of the network increases and its generalization ability improves. As $\alpha \to 0$, $\lambda$ concentrates at 0 or 1 and mixup reduces to the ordinary training strategy. Reference: www.jianshu.com/p/d22fcd86f…
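The interpolation described above can be sketched for a single pair of samples in NumPy. This is a minimal illustration rather than the generator used in this article; the helper name `mixup_pair`, the toy arrays, and the fixed seed are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility


def mixup_pair(x_i, y_i, x_j, y_j, alpha, rng):
    """Blend two (sample, one-hot label) pairs with lambda ~ Beta(alpha, alpha)."""
    lam = rng.beta(alpha, alpha)
    x = lam * x_i + (1.0 - lam) * x_j
    y = lam * y_i + (1.0 - lam) * y_j
    return x, y, lam


# Two toy 2x2 single-channel "images" and one-hot labels for 3 classes.
x_i, y_i = np.zeros((2, 2, 1)), np.array([1.0, 0.0, 0.0])
x_j, y_j = np.ones((2, 2, 1)), np.array([0.0, 0.0, 1.0])

x_mix, y_mix, lam = mixup_pair(x_i, y_i, x_j, y_j, alpha=0.2, rng=rng)
print(lam, y_mix)
```

The mixed label is no longer one-hot: it carries weight `lam` on the first class and `1 - lam` on the second, and still sums to 1.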
Create a new file mixup_generator.py and insert the following code:
import numpy as np


class MixupGenerator():
    def __init__(self, X_train, y_train, batch_size=32, alpha=0.2, shuffle=True, datagen=None):
        self.X_train = X_train
        self.y_train = y_train
        self.batch_size = batch_size
        self.alpha = alpha
        self.shuffle = shuffle
        self.sample_num = len(X_train)
        self.datagen = datagen

    def __call__(self):
        while True:
            indexes = self.__get_exploration_order()
            # Each mixed batch consumes 2 * batch_size samples.
            itr_num = int(len(indexes) // (self.batch_size * 2))
            for i in range(itr_num):
                batch_ids = indexes[i * self.batch_size * 2:(i + 1) * self.batch_size * 2]
                X, y = self.__data_generation(batch_ids)
                yield X, y

    def __get_exploration_order(self):
        indexes = np.arange(self.sample_num)
        if self.shuffle:
            np.random.shuffle(indexes)
        return indexes

    def __data_generation(self, batch_ids):
        _, h, w, c = self.X_train.shape
        # One mixing coefficient per sample, drawn from Beta(alpha, alpha).
        l = np.random.beta(self.alpha, self.alpha, self.batch_size)
        X_l = l.reshape(self.batch_size, 1, 1, 1)
        y_l = l.reshape(self.batch_size, 1)
        X1 = self.X_train[batch_ids[:self.batch_size]]
        X2 = self.X_train[batch_ids[self.batch_size:]]
        X = X1 * X_l + X2 * (1 - X_l)
        if self.datagen:
            for i in range(self.batch_size):
                X[i] = self.datagen.random_transform(X[i])
                X[i] = self.datagen.standardize(X[i])
        if isinstance(self.y_train, list):
            y = []
            for y_train_ in self.y_train:
                y1 = y_train_[batch_ids[:self.batch_size]]
                y2 = y_train_[batch_ids[self.batch_size:]]
                y.append(y1 * y_l + y2 * (1 - y_l))
        else:
            y1 = self.y_train[batch_ids[:self.batch_size]]
            y2 = self.y_train[batch_ids[self.batch_size:]]
            y = y1 * y_l + y2 * (1 - y_l)
        return X, y
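The index handling in `__call__` may be easier to follow on a toy example. The sketch below reproduces only the slicing logic; the sizes are assumptions, and shuffling is left out so that the printed indices are stable.

```python
import numpy as np

sample_num, batch_size = 10, 2       # toy sizes, chosen for illustration
indexes = np.arange(sample_num)      # shuffling omitted to keep output stable

# Every step takes 2 * batch_size indices: one half for X1, one half for X2.
itr_num = len(indexes) // (batch_size * 2)
for i in range(itr_num):
    batch_ids = indexes[i * batch_size * 2:(i + 1) * batch_size * 2]
    first, second = batch_ids[:batch_size], batch_ids[batch_size:]
    print(i, first, second)
```

With 10 samples and a batch size of 2, one pass yields two mixed batches and leaves the last two indices unused until the next shuffle.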
2. Import the required packages and set global parameters
import os

import cv2
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras import utils as np_utils  # provides np_utils.to_categorical
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import img_to_array

from mixup_generator import MixupGenerator

norm_size = 224
datapath = 'data/train'
EPOCHS = 20
INIT_LR = 1e-3
labelList = []
dicClass = {'Black-grass': 0, 'Charlock': 1, 'Cleavers': 2, 'Common Chickweed': 3, 'Common wheat': 4,
            'Fat Hen': 5, 'Loose Silky-bent': 6, 'Maize': 7, 'Scentless Mayweed': 8, 'Shepherds Purse': 9,
            'Small-flowered Cranesbill': 10, 'Sugar beet': 11}
classnum = 12
batch_size = 16
Keras has been integrated into TensorFlow since version 2.0, so there is no need to install Keras separately. Older Keras code can be ported to TensorFlow 2.x by importing from tensorflow.keras instead of keras.
Apart from the imports, there are several important global parameters:
- norm_size = 224: the image size. MobileNet's default input size is 224×224.
- datapath = 'data/train': the path where the images are stored. If there are many images, do not put them in the project directory; otherwise PyCharm scans all of them when loading the project, which is very slow.
- EPOCHS: the number of training epochs. The code above uses 20.
- INIT_LR = 1e-3: the initial learning rate. It generally starts at 0.001 and is reduced gradually; it should not become too small, and it can go down to about 1e-6.
- classnum = 12: the number of categories. The dataset has 12 categories, so 12 classes are defined.
- batch_size = 16: the batch size. Set it according to the hardware and the dataset size. If the value is too small, the loss fluctuates heavily. On Windows you can view GPU memory usage in Task Manager; on Ubuntu you can use nvidia-smi.
3. Load images
Image processing steps:
- Read the image
- Resize the image to the specified size.
- Convert the image to an array.
- Normalize the image.
- One-hot encode the labels.
See the code for the details:
def loadImageData():
    imageList = []
    listClasses = os.listdir(datapath)  # one sub-folder per category
    print(listClasses)
    for class_name in listClasses:
        label_id = dicClass[class_name]
        class_path = os.path.join(datapath, class_name)
        image_names = os.listdir(class_path)
        for image_name in image_names:
            image_full_path = os.path.join(class_path, image_name)
            labelList.append(label_id)
            image = cv2.imdecode(np.fromfile(image_full_path, dtype=np.uint8), -1)
            image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
            if image.shape[2] > 3:
                image = image[:, :, :3]  # drop the alpha channel
            print(image.shape)
            image = img_to_array(image)
            imageList.append(image)
    imageList = np.array(imageList) / 255.0  # scale pixel values to [0, 1]
    return imageList


print("Start loading data")
imageArr = loadImageData()
print(type(imageArr))
labelList = np.array(labelList)
print("Loading data completed")
print(labelList)
labelList = np_utils.to_categorical(labelList, classnum)  # one-hot encode the labels
print(labelList)
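For readers unfamiliar with one-hot encoding, the following NumPy sketch shows the kind of array the encoding step produces. It uses an identity-matrix lookup as a stand-in and is not the Keras implementation; the example label IDs are assumptions.

```python
import numpy as np

classnum = 12
labels = np.array([0, 4, 11])          # example integer class IDs

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(classnum, dtype="float32")[labels]
print(one_hot.shape)
print(one_hot[1])
```

Each row contains a single 1 at the position of its class ID, which is the label format that categorical cross-entropy and Mixup both expect.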
Once the data is ready, split it into a training set and a test set (used here as the validation set), usually at a ratio of 4:1 or 7:3. Use the train_test_split() method, imported with from sklearn.model_selection import train_test_split. For example:
trainX, valX, trainY, valY = train_test_split(imageArr, labelList, test_size=0.2, random_state=42)
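Conceptually, a 4:1 split shuffles the sample indices and cuts them at 20%. The sketch below does this in plain NumPy on toy data; it only illustrates the idea and is not how scikit-learn implements train_test_split. The toy arrays and the seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)        # seed chosen only for reproducibility
X = np.arange(20).reshape(10, 2)       # 10 toy samples
y = np.arange(10)                      # toy labels, one per sample

test_size = 0.2
perm = rng.permutation(len(X))         # shuffle the sample indices
n_val = int(np.ceil(len(X) * test_size))
val_idx, train_idx = perm[:n_val], perm[n_val:]

# Index samples and labels with the same indices so pairs stay aligned.
trainX, valX = X[train_idx], X[val_idx]
trainY, valY = y[train_idx], y[val_idx]
print(trainX.shape, valX.shape)
```

The important property is that samples and labels are indexed with the same permutation, so every image keeps its own label after the split.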
4. Image augmentation
ImageDataGenerator() is the image generator in the keras.preprocessing.image module. It can also augment the data batch by batch, which enlarges the dataset and improves the generalization ability of the model. It supports rotation, shifting, normalization, and so on.
keras.preprocessing.image.ImageDataGenerator(featurewise_center=False, samplewise_center=False,
    featurewise_std_normalization=False, samplewise_std_normalization=False, zca_whitening=False,
    zca_epsilon=1e-06, rotation_range=0.0, width_shift_range=0.0, height_shift_range=0.0,
    brightness_range=None, shear_range=0.0, zoom_range=0.0, channel_shift_range=0.0,
    fill_mode='nearest', cval=0.0, horizontal_flip=False, vertical_flip=False, rescale=None,
    preprocessing_function=None, data_format=None, validation_split=0.0)
Parameters:
- featurewise_center: Boolean. Subtract the dataset mean from each input image, per channel.
- samplewise_center: Boolean. Subtract the sample mean from each image so that each sample has a mean of 0.
- featurewise_std_normalization: Boolean. Divide each input by the standard deviation of the dataset, per channel.
- samplewise_std_normalization: Boolean. Divide each input by its own standard deviation.
- zca_epsilon: epsilon for ZCA whitening; the default is 1e-6.
- zca_whitening: Boolean. Apply ZCA whitening, which decorrelates the input features.
- rotation_range: range of random rotations, in degrees.
- width_shift_range: range of random horizontal shifts.
- height_shift_range: range of random vertical shifts.
- shear_range: float, the shear intensity (shear angle in degrees).
- zoom_range: range of random zoom.
- fill_mode: how points outside the input boundaries are filled; one of 'constant', 'nearest', 'reflect', or 'wrap'.
- cval: the fill value used when fill_mode == 'constant'.
- horizontal_flip: Boolean. Randomly flip inputs horizontally.
- vertical_flip: Boolean. Randomly flip inputs vertically.
- preprocessing_function: a user-provided function applied to each input.
- data_format: 'channels_first' or 'channels_last'.
- validation_split: the fraction of the data reserved for validation.
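The effect of the two sample-wise options in the list above can be sketched in NumPy. This mimics what samplewise_center and samplewise_std_normalization compute per image; the random toy batch and the small constant added to the standard deviation are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.uniform(0, 255, size=(4, 8, 8, 3))   # 4 toy RGB images

# Per-image mean and standard deviation over height, width and channels.
mean = batch.mean(axis=(1, 2, 3), keepdims=True)
std = batch.std(axis=(1, 2, 3), keepdims=True)

centered = batch - mean                 # effect of samplewise_center
normalized = centered / (std + 1e-6)    # effect of samplewise_std_normalization
print(centered.mean(axis=(1, 2, 3)))
```

After these two steps every image has a mean of approximately 0 and a standard deviation of approximately 1, regardless of its original brightness or contrast.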
The image augmentation code used in this example is as follows:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)
val_datagen = ImageDataGenerator()  # the validation set is not augmented
training_generator_mix = MixupGenerator(trainX, trainY, batch_size=batch_size, alpha=0.2, datagen=train_datagen)()
val_generator = val_datagen.flow(valX, valY, batch_size=batch_size, shuffle=True)
5. Retain the best model and dynamically set the learning rate
ModelCheckpoint: Used to save the model with the best performance.
The syntax is as follows:
keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1)
This callback saves the model to filepath after each epoch.
filepath can be a format string whose placeholders are filled with the epoch value and the keys of the logs passed to on_epoch_end.
For example, if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5, multiple files are generated, each named with the epoch number and the validation loss.
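The placeholder syntax is ordinary Python string formatting, so the resulting file name can be previewed without Keras. The template below uses balanced braces, `{epoch:02d}-{val_loss:.2f}`; the epoch number and loss value are hypothetical.

```python
template = "weights.{epoch:02d}-{val_loss:.2f}.hdf5"

# Hypothetical values for one epoch; during training Keras supplies them.
name = template.format(epoch=3, val_loss=0.4567)
print(name)  # weights.03-0.46.hdf5
```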
Parameters:
- filepath: the path where the model is saved.
- monitor: the quantity to be monitored.
- verbose: the information display mode, 0 or 1.
- save_best_only: when set to True, only the model that performs best on the validation set is saved.
- mode: one of 'auto', 'min', 'max'. It determines how the best model is chosen when save_best_only=True. For example, when the monitored value is val_acc the mode should be max, and when it is val_loss the mode should be min. In auto mode, the criterion is inferred from the name of the monitored value.
- save_weights_only: if set to True, only the model weights are saved; otherwise the whole model (including structure, configuration, etc.) is saved.
- period: the number of epochs between checkpoints. In TensorFlow 2.x this argument is deprecated in favor of save_freq.
ReduceLROnPlateau: reduces the learning rate when the monitored metric stops improving. The syntax is as follows:
keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto', epsilon=0.0001, cooldown=0, min_lr=0)
When learning stagnates, reducing the learning rate by a factor of 2 to 10 often helps. This callback monitors a metric, and if the metric shows no improvement for patience epochs, it reduces the learning rate.
Parameters:
- monitor: the quantity to be monitored.
- factor: the factor by which the learning rate is reduced each time, as lr = lr * factor.
- patience: the number of epochs without improvement after which the learning rate is reduced.
- mode: one of 'auto', 'min', 'max'. In min mode, the learning rate is reduced when the monitored value stops decreasing; in max mode, when it stops increasing.
- epsilon: the threshold for deciding whether the monitored value has reached a plateau, so that only significant changes count as improvement. In TensorFlow 2.x this argument is named min_delta.
- cooldown: the number of epochs to wait after a reduction before normal monitoring resumes.
- min_lr: the lower bound of the learning rate.
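How patience, factor and min_lr interact can be seen in a small pure-Python simulation. This is a simplified sketch of the plateau logic for a metric that should increase; it ignores cooldown and the improvement threshold, and the function name and the accuracy curve are hypothetical.

```python
def simulate_reduce_lr(values, lr, factor=0.5, patience=2, min_lr=1e-6):
    """Return the learning rate after each epoch for a 'max'-mode monitor.

    Simplified: no cooldown and no improvement threshold.
    """
    best = float("-inf")
    wait = 0
    history = []
    for value in values:
        if value > best:          # metric improved: reset the counter
            best = value
            wait = 0
        else:                     # no improvement this epoch
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history


# Hypothetical val_accuracy curve that stalls after the second epoch.
val_acc = [0.60, 0.70, 0.70, 0.69, 0.71, 0.71, 0.70]
lrs = simulate_reduce_lr(val_acc, lr=1e-3)
print(lrs)
```

In this run the rate is halved after two consecutive epochs without a new best value, first at the fourth epoch and again at the seventh.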
The code for this example is as follows:
checkpointer = ModelCheckpoint(filepath='best_model.hdf5',
                               monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
reduce = ReduceLROnPlateau(monitor='val_accuracy', patience=10,
                           verbose=1,
                           factor=0.5,
                           min_lr=1e-6)
6. Build the model and train it
model = Sequential()
# MobileNet backbone pre-trained on ImageNet, without its classification head
model.add(MobileNet(include_top=False, pooling='avg', weights='imagenet'))
model.add(Dense(classnum, activation='softmax'))
model.summary()
optimizer = Adam(learning_rate=INIT_LR)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(training_generator_mix,
                    steps_per_epoch=trainX.shape[0] // batch_size,  # integer number of steps
                    validation_data=val_generator,
                    epochs=EPOCHS,
                    validation_steps=valX.shape[0] // batch_size,
                    callbacks=[checkpointer, reduce])
model.save('my_model.h5')
print(history)
Running results:
As training progresses, the accuracy reaches 0.97.
7. Save the training results and plot the curves
import matplotlib.pyplot as plt

loss_trend_graph_path = r"WW_loss.jpg"
acc_trend_graph_path = r"WW_acc.jpg"
print("Now,we start drawing the loss and acc trends graph...")
# summarize history for accuracy
fig = plt.figure(1)
plt.plot(history.history["accuracy"])
plt.plot(history.history["val_accuracy"])
plt.title("Model accuracy")
plt.ylabel("accuracy")
plt.xlabel("epoch")
plt.legend(["train", "test"], loc="upper left")
plt.savefig(acc_trend_graph_path)
plt.close(1)
# summarize history for loss
fig = plt.figure(2)
plt.plot(history.history["loss"])
plt.plot(history.history["val_loss"])
plt.title("Model loss")
plt.ylabel("loss")
plt.xlabel("epoch")
plt.legend(["train", "test"], loc="upper left")
plt.savefig(loss_trend_graph_path)
plt.close(2)
print("We are done, everything seems OK...")
# Optionally shut down Windows 10 seconds after training finishes
# os.system("shutdown -s -t 10")
Results:
Testing
Single-image prediction
1. Import dependencies
import cv2
import numpy as np
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
import time
2. Set global parameters
Note that the class order in the dictionary must match the order used during training.
norm_size = 224
imagelist = []
emotion_labels = {
    0: 'Black-grass', 1: 'Charlock', 2: 'Cleavers', 3: 'Common Chickweed', 4: 'Common wheat',
    5: 'Fat Hen', 6: 'Loose Silky-bent', 7: 'Maize', 8: 'Scentless Mayweed', 9: 'Shepherds Purse',
    10: 'Small-flowered Cranesbill', 11: 'Sugar beet',
}
3. Load the model
emotion_classifier=load_model("best_model.hdf5")
t1=time.time()
4. Process the image
The logic for processing the image is the same as for the training set:
- Read the image.
- Resize the image to norm_size×norm_size.
- Convert the image to an array.
- Append it to imagelist.
- Divide imagelist by 255 to scale the values to between 0 and 1.
image = cv2.imdecode(np.fromfile('data/test/0a64e3e6c.png', dtype=np.uint8), -1)
# load the image, pre-process it, and store it in the data list
image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
if image.shape[2] > 3:
    image = image[:, :, :3]  # drop the alpha channel, as in training
image = img_to_array(image)
imagelist.append(image)
imageList = np.array(imagelist, dtype="float") / 255.0
5. Predict the category
Predict the class probabilities and take the index with the highest probability.
out=emotion_classifier.predict(imageList)
print(out)
pre=np.argmax(out)
emotion = emotion_labels[pre]
t2=time.time()
print(emotion)
t3=t2-t1
print(t3)
Running results:
[[1.7556800E-03 8.5450716E-07 1.9150861E-05 1.9705877E-07 9.9732012E-01 8.0649025E-04 2.5912817E-07 2.2540871E-06 ]
Common wheat
3.50178861618042
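The step from a probability vector to a class name, for one image or for a whole batch, can be sketched as follows. The probabilities are made-up values rather than real model output, and `class_names` is a hypothetical list holding the same names in the same order as the label dictionary.

```python
import numpy as np

class_names = ['Black-grass', 'Charlock', 'Cleavers', 'Common Chickweed',
               'Common wheat', 'Fat Hen', 'Loose Silky-bent', 'Maize',
               'Scentless Mayweed', 'Shepherds Purse',
               'Small-flowered Cranesbill', 'Sugar beet']

# Hypothetical softmax outputs for a batch of two images (12 classes each).
out = np.full((2, 12), 0.01)
out[0, 4] = 0.89
out[1, 7] = 0.89

pred_ids = np.argmax(out, axis=1)          # one class index per row
pred_names = [class_names[i] for i in pred_ids]
print(pred_names)
```

Passing axis=1 returns one index per image, so the same two lines work for a single image and for a batch.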
Batch prediction
The difference between batch prediction and single-image prediction lies mainly in how the data is read and how the predicted categories are processed afterwards. Nothing else changes.
Steps:
- Load the model.
- Define the directory of the test set.
- Get the images in the directory.
- Loop over the images.
- Read each image.
- Resize the image.
- Convert it to an array.
- Append it to imagelist.
- Divide by 255 to scale the values to between 0 and 1.
- Predict.
import os  # in addition to the imports and global parameters of the single-image section

emotion_classifier = load_model("best_model.hdf5")
t1 = time.time()
predict_dir = 'data/test'
test11 = os.listdir(predict_dir)
for file in test11:
    filepath = os.path.join(predict_dir, file)
    image = cv2.imdecode(np.fromfile(filepath, dtype=np.uint8), -1)
    # load the image, pre-process it, and store it in the data list
    image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
    if image.shape[2] > 3:
        image = image[:, :, :3]  # drop the alpha channel, as in training
    image = img_to_array(image)
    imagelist.append(image)
imageList = np.array(imagelist, dtype="float") / 255.0
out = emotion_classifier.predict(imageList)
print(out)
pre = [np.argmax(i) for i in out]  # predicted class index for each image
Running results:
Complete code: Download.csdn.net/download/hh…