I. Preliminary work
This article implements recognition of the characters in the anime Ling Long (Spirit Cage). Compared with the previous article, this time I adopt the VGG-19 architecture and add two new parts: prediction, and saving/loading the model.
My environment:
- Language: Python 3.6.5
- Editor: Jupyter Notebook
- Deep learning framework: TensorFlow 2.4.1
Recommended Reading:
- 100 Examples of Deep Learning – Convolutional neural network (CNN) for MNIST handwritten digit recognition | Day 1
- 100 Examples of Deep Learning – Convolutional neural network (CNN) for color image classification | Day 2
- 100 Examples of Deep Learning – Convolutional neural network (CNN) for weather recognition | Day 5
- 100 Examples of Deep Learning (VGG-16) – Convolutional neural network to recognize the One Piece Straw Hat crew | Day 6
From the column: 100 Examples of Deep Learning
1. Set the GPU
You can skip this step if you are using a CPU
```python
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # Grow GPU memory usage as needed
    tf.config.set_visible_devices([gpus[0]], "GPU")
```
2. Import data
```python
import matplotlib.pyplot as plt

# Support Chinese characters in plots
plt.rcParams['font.sans-serif'] = ['SimHei']  # Display Chinese labels normally
plt.rcParams['axes.unicode_minus'] = False    # Display the minus sign normally

import os, PIL

# Set random seeds to make the results as reproducible as possible
import numpy as np
np.random.seed(1)

import tensorflow as tf
tf.random.set_seed(1)

from tensorflow import keras
from tensorflow.keras import layers, models

import pathlib
```
data_dir = "D:/jupyter notebook/DL-100-days/datasets/linglong_photos"
data_dir = pathlib.Path(data_dir)
Copy the code
3. View data
There are six characters in the data set: Bai Yuekui, Charles, Hong Kou, Mark, Morgan, and Ran Bing.
| Folder | Character | Number of images |
| --- | --- | --- |
| baiyuekui | Bai Yuekui | 40 |
| chaersi | Charles | 76 |
| hongkou | Hong Kou | 36 |
| make | Mark | 38 |
| mogen | Morgan | 30 |
| ranbing | Ran Bing | 60 |
```python
image_count = len(list(data_dir.glob('*/*')))
print("The total number of pictures is:", image_count)
```
```
The total number of pictures is: 280
```
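As a quick check of the per-class counts in the table above, you can count the files in each sub-folder; a minimal sketch, assuming one sub-folder per character as listed:

```python
# Count the images in each character's folder (sketch; assumes one sub-folder per class)
for folder in sorted(data_dir.iterdir()):
    if folder.is_dir():
        print(folder.name, len(list(folder.glob('*'))))
```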
II. Data preprocessing
1. Load data
Use the image_dataset_from_directory method to load the data from disk into a tf.data.Dataset.
```python
batch_size = 16
img_height = 224
img_width = 224
```
"" "about image_dataset_from_directory () articles detailing can refer to: https://mtyjkh.blog.csdn.net/article/details/117018789, "" "
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
data_dir,
validation_split=0.1,
subset="training",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
Copy the code
```
Found 280 files belonging to 6 classes.
Using 252 files for training.
```
"" "about image_dataset_from_directory () articles detailing can refer to: https://mtyjkh.blog.csdn.net/article/details/117018789, "" "
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
data_dir,
validation_split=0.1,
subset="validation",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
Copy the code
```
Found 280 files belonging to 6 classes.
Using 28 files for validation.
```
We can output the labels of the dataset through class_names; the labels correspond to the directory names in alphabetical order.
```python
class_names = train_ds.class_names
print(class_names)
```

```
['baiyuekui', 'chaersi', 'hongkou', 'make', 'mogen', 'ranbing']
```
2. Visualize data
```python
plt.figure(figsize=(10, 5))  # The width of the figure is 10 and the height is 5
plt.suptitle("Wechat official Account: STUDENT K")

for images, labels in train_ds.take(1):
    for i in range(8):
        ax = plt.subplot(2, 4, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")
```
```python
plt.imshow(images[1].numpy().astype("uint8"))
```
3. Check the data again
```python
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
```

```
(16, 224, 224, 3)
(16,)
```
`image_batch` is a tensor of shape (16, 224, 224, 3): a batch of 16 images of shape 224×224×3 (the last dimension is the RGB color channels). `labels_batch` is a tensor of shape (16,); these labels correspond to the 16 images.
4. Configure the data set
- shuffle(): shuffles the data; a detailed introduction to this function is available at: zhuanlan.zhihu.com/p/42417456
- prefetch(): prefetches data to speed up execution; it was described in detail in my previous two articles.
- cache(): caches the data set in memory to speed up execution.
```python
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
```
5. Normalization
```python
normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)

normalization_train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
val_ds = val_ds.map(lambda x, y: (normalization_layer(x), y))
```
```python
image_batch, labels_batch = next(iter(val_ds))
first_image = image_batch[0]

# Check the normalized data
print(np.min(first_image), np.max(first_image))
```
```
0.00390696 1.0
```

The minimum is about 1/255 ≈ 0.0039 rather than 0, which just means the darkest pixel in this sample image has the value 1 instead of 0.
III. Build the VGG-19 network
Choose one of the two options below, the official model or the self-built model, and comment out the other; both are genuine VGG-19.
VGG advantages and disadvantages:
- VGG advantages
The structure of VGG is very simple: the whole network uses the same convolution kernel size (3×3) and max-pooling size (2×2).
- VGG disadvantages
1) Training takes a long time and tuning is difficult. 2) It requires a lot of storage, which hampers deployment; for example, the VGG-16 weight file is more than 500 MB, making it unsuitable for embedded systems.
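As a rough sanity check on that size claim: parameters are stored as float32 (4 bytes each), so the roughly 143.7 million parameters of the VGG-19 built below (see the model summary further down) come to about 548 MB, and the same arithmetic explains why VGG-16's weight file exceeds 500 MB:

```python
# Rough arithmetic: parameter count × 4 bytes (float32), converted to MB
params_vgg19 = 143_667_240            # total params from the model summary below
print(params_vgg19 * 4 / 1024 ** 2)   # ≈ 548 MB
```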
1. Official model (packaged)
I'll cover invoking the official pre-packaged models in detail in the next few articles; here I focus on the self-built VGG-19.
```python
# model = keras.applications.VGG19(weights='imagenet')
# model.summary()
```
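For reference, a minimal sketch of how the pre-packaged model could be adapted to this six-class task via the usual transfer-learning pattern; this is an assumed setup for illustration, not the approach this article takes (the article trains the self-built network from scratch):

```python
# Sketch: official VGG19 without the ImageNet head, plus a small custom classifier
# (assumed transfer-learning setup, not used in the rest of this article)
base = keras.applications.VGG19(weights='imagenet', include_top=False,
                                input_shape=(img_height, img_width, 3))
base.trainable = False  # freeze the pretrained convolutional base

official_model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dense(len(class_names), activation='softmax')
])
```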
2. Self-built model
```python
from tensorflow.keras import layers, models, Input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout

def VGG19(nb_classes, input_shape):
    input_tensor = Input(shape=input_shape)
    # 1st block
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(input_tensor)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
    # 2nd block
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
    # 3rd block
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv4')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
    # 4th block
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv4')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
    # 5th block
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv4')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
    # fully connected layers
    x = Flatten()(x)
    x = Dense(4096, activation='relu', name='fc1')(x)
    x = Dense(4096, activation='relu', name='fc2')(x)
    output_tensor = Dense(nb_classes, activation='softmax', name='predictions')(x)

    model = Model(input_tensor, output_tensor)
    return model

model = VGG19(1000, (img_width, img_height, 3))
model.summary()
```
```
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
_________________________________________________________________
block3_conv4 (Conv2D)        (None, 56, 56, 256)       590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
_________________________________________________________________
block4_conv4 (Conv2D)        (None, 28, 28, 512)       2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808
_________________________________________________________________
block5_conv4 (Conv2D)        (None, 14, 14, 512)       2359808
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000
=================================================================
Total params: 143,667,240
Trainable params: 143,667,240
Non-trainable params: 0
_________________________________________________________________
```
3. Network structure diagram
For background on how convolution output sizes and parameter counts are calculated, see: mtyjkh.blog.csdn.net/article/det…
Structure description:
- The 16 convolutional layers are named `blockX_convX`.
- The 3 fully connected layers are named `fcX` and `predictions`.
- The 5 pooling layers are named `blockX_pool`.
VGG-19 contains 19 hidden layers (16 convolutional layers and 3 fully connected layers), hence the name VGG-19.
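You can confirm this breakdown directly from the model object, and spot-check a parameter count against the summary; a small sketch:

```python
from collections import Counter

# Count layers by type to confirm the 16 conv / 5 pool / 3 dense breakdown
counts = Counter(type(layer).__name__ for layer in model.layers)
print(counts)  # includes Conv2D: 16, MaxPooling2D: 5, Dense: 3

# Per-layer parameter check for block1_conv1: 3*3*3*64 weights + 64 biases
print(3 * 3 * 3 * 64 + 64)  # 1792, matching the summary above
```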
IV. Compile the model
Before you are ready to train the model, a few more settings are needed. The following are added in the compile step:
- Loss function: measures the model's accuracy during training.
- Optimizer: determines how the model is updated based on the data it sees and its own loss function.
- Metrics: used to monitor the training and testing steps. The following example uses accuracy, the fraction of images that are correctly classified.
```python
# Set the optimizer; I changed the learning rate here.
opt = tf.keras.optimizers.Nadam(learning_rate=1e-5)

model.compile(optimizer=opt,
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
```
V. Train the model
```python
epochs = 10

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)
```
```
Epoch 1/10
16/16 [==============================] - 12s 276ms/step - loss: 5.4474 - accuracy: 0.1501 - val_loss: 6.8601 - val_accuracy: 0.0714
Epoch 2/10
16/16 [==============================] - 2s 133ms/step - loss: 1.7873 - accuracy: 0.3191 - val_loss: 6.8396 - val_accuracy: 0.4643
Epoch 3/10
16/16 [==============================] - 2s 137ms/step - loss: 1.4631 - accuracy: 0.4250 - val_loss: 6.8453 - val_accuracy: 0.5714
Epoch 4/10
16/16 [==============================] - 2s 130ms/step - loss: 1.1500 - accuracy: 0.6090 - val_loss: 6.8554 - val_accuracy: 0.3571
Epoch 5/10
16/16 [==============================] - 2s 130ms/step - loss: 1.0349 - accuracy: 0.6292 - val_loss: 6.8421 - val_accuracy: 0.4643
Epoch 6/10
16/16 [==============================] - 2s 131ms/step - loss: 1.0131 - accuracy: 0.5919 - val_loss: 6.8288 - val_accuracy: 0.5714
Epoch 7/10
16/16 [==============================] - 2s 131ms/step - loss: 0.6961 - accuracy: 0.7776 - val_loss: 6.8388 - val_accuracy: 0.6429
Epoch 8/10
16/16 [==============================] - 2s 130ms/step - loss: 0.3716 - accuracy: 0.8975 - val_loss: 6.8132 - val_accuracy: 0.5714
Epoch 9/10
16/16 [==============================] - 2s 130ms/step - loss: 0.3372 - accuracy: 0.8586 - val_loss: 6.8059 - val_accuracy: 0.6071
Epoch 10/10
16/16 [==============================] - 2s 130ms/step - accuracy: 0.9736 - val_loss: 6.7767 - val_accuracy: 0.8929
```
VI. Model evaluation
```python
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.suptitle("Wechat official Account: STUDENT K")

plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
```
To stay faithful to the original VGG-19, this article does not modify the model's parameters. In practice, you can adjust the relevant parameters of the model to fit your actual situation and improve the classification results.
Compared with the previous article [100 Examples of Deep Learning (VGG-16) – Convolutional neural network to recognize the One Piece Straw Hat crew | Day 6], I made the following three changes:
- Changed the model from VGG-16 to VGG-19.
- Changed the learning_rate from 1e-4 to 1e-5.
- Changed the data set.
Starting to get a feel for it? It doesn't matter if you don't fully understand yet; everything will be explained one by one later. The point here is to give you hands-on experience first.
VII. Save and load the model
This is the simplest way to save and load a model:
```python
# Save the model
model.save('model/my_model.h5')
```
```python
# Load the model
new_model = keras.models.load_model('model/my_model.h5')
```
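To verify that the round trip preserved the weights, you can compare the two models' predictions on one batch; a minimal sketch:

```python
# Sketch: check that the loaded model reproduces the original model's outputs
images, labels = next(iter(val_ds))
np.testing.assert_allclose(
    model.predict(images), new_model.predict(images), rtol=1e-5
)
print("Saved and loaded models agree.")
```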
VIII. Predict
```python
# Use the loaded model (new_model) to look at the prediction results
plt.figure(figsize=(10, 5))  # The width of the figure is 10 and the height is 5
plt.suptitle("Wechat official Account: STUDENT K")

for images, labels in val_ds.take(1):
    for i in range(8):
        ax = plt.subplot(2, 4, i + 1)

        # Display the image
        plt.imshow(images[i])

        # Add a batch dimension to the image
        img_array = tf.expand_dims(images[i], 0)

        # Use the model to predict the character in the picture
        predictions = new_model.predict(img_array)
        plt.title(class_names[np.argmax(predictions)])

        plt.axis("off")
```
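If you want to classify a single image straight from disk rather than a batch from val_ds, here is a minimal sketch; the file path is a made-up example, and the 1/255 scaling mirrors the val_ds pipeline above:

```python
# Sketch: predict one image loaded from disk (the path below is a hypothetical example)
img_path = "D:/jupyter notebook/DL-100-days/datasets/linglong_photos/make/1.jpg"

img = tf.keras.preprocessing.image.load_img(img_path, target_size=(img_height, img_width))
img_array = tf.keras.preprocessing.image.img_to_array(img) / 255.0  # same scaling as val_ds
img_array = tf.expand_dims(img_array, 0)  # add the batch dimension

predictions = new_model.predict(img_array)
print("Predicted character:", class_names[np.argmax(predictions)])
```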
This VGG-19 article actually has quite a few pits buried in it; I've hidden them rather cleverly, and I wonder whether you have found them. Feel free to discuss your findings in the comments below. For a perfectionist, these imperfections are hard to watch, so let's see whether a couple of follow-up articles can address them.
Recommended Reading:
- 100 Examples of Deep Learning – Convolutional neural network (CNN) for MNIST handwritten digit recognition | Day 1
- 100 Examples of Deep Learning – Convolutional neural network (CNN) for color image classification | Day 2
- 100 Examples of Deep Learning – Convolutional neural network (CNN) for weather recognition | Day 5
- 100 Examples of Deep Learning (VGG-16) – Convolutional neural network to recognize the One Piece Straw Hat crew | Day 6
From the column: 100 Examples of Deep Learning
On WeChat, search for the official account [Student K]; after following, reply [DL+7] to get the dataset.