I. Preliminary work

This article implements recognition of the characters in Spirit Cage. Compared with the previous article, this time I use the VGG-19 architecture and add two parts: prediction, and saving and loading the model.

My environment:

Language: Python 3.6.5
Editor: Jupyter Notebook
Deep learning environment: TensorFlow 2.4.1

Recommended Reading:

100 Examples of Deep Learning – Convolutional neural network (CNN) for MNIST handwritten digit recognition | Day 1
100 Examples of Deep Learning – Convolutional neural network (CNN) for color image classification | Day 2
100 Examples of Deep Learning – Convolutional neural network (CNN) for weather recognition | Day 5
100 Examples of Deep Learning (VGG-16) – Convolutional neural network for recognizing the One Piece Straw Hat gang | Day 6

From the column: 100 Examples of Deep Learning

1. Set the GPU

You can skip this step if you are using a CPU

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # Allocate GPU memory on demand
    tf.config.set_visible_devices([gpus[0]], "GPU")

2. Import data

import matplotlib.pyplot as plt
# Support Chinese characters in plots
plt.rcParams['font.sans-serif'] = ['SimHei']  # Display Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False  # Display the minus sign correctly

import os, PIL

# Set random seeds so that the results are as reproducible as possible
import numpy as np
np.random.seed(1)

import tensorflow as tf
tf.random.set_seed(1)

from tensorflow import keras
from tensorflow.keras import layers, models

import pathlib

data_dir = "D:/jupyter notebook/DL-100-days/datasets/linglong_photos"

data_dir = pathlib.Path(data_dir)

3. View data

The data set contains six characters: Bai Yuekui, Charles, Hongkou, Mark, Morgan, and Ran Bing.

Folder	Character	Number of images
baiyuekui	Bai Yuekui	40
chaersi	Charles	76
hongkou	Hongkou	36
make	Mark	38
mogen	Morgan	30
ranbing	Ran Bing	60

image_count = len(list(data_dir.glob('*/*')))

print("The total number of pictures is:", image_count)

The total number of pictures is: 280
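
To check the per-folder numbers in the table above, the files in each class directory can be counted separately. The following is a minimal sketch; the helper name count_images_per_class is hypothetical, and it assumes that every file directly inside a class folder is an image.

```python
from pathlib import Path

def count_images_per_class(data_dir):
    """Return {class_folder_name: number_of_files} for each sub-folder of data_dir."""
    root = Path(data_dir)
    return {
        folder.name: sum(1 for f in folder.iterdir() if f.is_file())
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }
```

Called as count_images_per_class(data_dir), the returned values should add up to the total printed above.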

II. Data preprocessing

1. Load data

Use the image_dataset_from_directory method to load the data from disk into a tf.data.Dataset.

batch_size = 16
img_height = 224
img_width = 224

# For a detailed introduction to image_dataset_from_directory(), see:
# https://mtyjkh.blog.csdn.net/article/details/117018789
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.1,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

Found 280 files belonging to 6 classes.
Using 252 files for training.

# For a detailed introduction to image_dataset_from_directory(), see:
# https://mtyjkh.blog.csdn.net/article/details/117018789
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.1,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

Found 280 files belonging to 6 classes.
Using 28 files for validation.
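
The 252/28 split reported above follows from validation_split=0.1 applied to 280 files. Here is a sketch of the arithmetic, assuming the validation count is the truncated product and the training subset receives the remainder. split_counts is a hypothetical helper, not a Keras function.

```python
def split_counts(num_files, validation_split):
    """Return (num_training, num_validation) for a fractional validation split."""
    num_val = int(validation_split * num_files)  # truncate toward zero
    return num_files - num_val, num_val

print(split_counts(280, 0.1))  # (252, 28)
```

Both calls above pass the same seed=123, which is what keeps the training and validation subsets consistent with each other.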

The class_names attribute returns the labels of the data set. The labels correspond to the directory names in alphabetical order.

class_names = train_ds.class_names
print(class_names)

['baiyuekui', 'chaersi', 'hongkou', 'make', 'mogen', 'ranbing']
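
The alphabetical correspondence can be illustrated without TensorFlow: sorting the folder names gives the order in which integer labels are assigned. This sketch assumes plain lexicographic sorting, and the variable names are only illustrative.

```python
folders = ["ranbing", "make", "baiyuekui", "mogen", "hongkou", "chaersi"]

# Sorted folder names become the class list; the list index is the integer label.
class_names = sorted(folders)
label_of = {name: index for index, name in enumerate(class_names)}

print(class_names)
print(label_of["make"])  # 3
```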

2. Visualize data

plt.figure(figsize=(10, 5))  # The figure is 10 wide and 5 high
plt.suptitle("Wechat official Account: STUDENT K")

for images, labels in train_ds.take(1):
    for i in range(8):

        ax = plt.subplot(2, 4, i + 1)

        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])

        plt.axis("off")

plt.imshow(images[1].numpy().astype("uint8"))

3. Check the data again

for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break

(16, 224, 224, 3)
(16,)

image_batch is a tensor of shape (16, 224, 224, 3). This is a batch of 16 images of shape 224x224x3 (the last dimension is the RGB color channels).
labels_batch is a tensor of shape (16,); these labels correspond to the 16 images.
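
With 252 training images and batch_size = 16, the data does not divide evenly, so the last batch is smaller. This is a sketch of the batch arithmetic, assuming the final partial batch is kept rather than dropped. batch_sizes is a hypothetical helper.

```python
import math

def batch_sizes(num_samples, batch_size):
    """Return the size of every batch when the last partial batch is kept."""
    num_batches = math.ceil(num_samples / batch_size)
    return [min(batch_size, num_samples - i * batch_size) for i in range(num_batches)]

sizes = batch_sizes(252, 16)
print(len(sizes), sizes[-1])  # 16 12
```

The 16 batches are consistent with the 16/16 step counter in the training log later in this article.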

4. Configure the data set

shuffle(): shuffles the data. For a detailed introduction to this function, see: zhuanlan.zhihu.com/p/42417456
prefetch(): prefetches data to speed up execution. It is described in detail in my previous two articles.
cache(): caches the data set in memory to speed up execution.

AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
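
To give an intuition for what the buffer size passed to shuffle() means, here is a pure-Python illustration of a fixed-size shuffle buffer. It is a sketch of the general idea (fill a buffer, then repeatedly emit a random element and refill), not TensorFlow's implementation. buffered_shuffle is a hypothetical name.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in an order produced by a fixed-size shuffle buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) == buffer_size:
            # Buffer is full: emit one random element to make room.
            yield buffer.pop(rng.randrange(len(buffer)))
        buffer.append(item)
    # Input exhausted: drain the remaining elements in random order.
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))

print(list(buffered_shuffle(range(5), buffer_size=1)))  # [0, 1, 2, 3, 4]
```

A buffer of size 1 leaves the order unchanged, while a buffer at least as large as the data set (as with shuffle(1000) on 252 images) behaves like a full shuffle.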

5. Normalization

normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)

normalization_train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
val_ds = val_ds.map(lambda x, y: (normalization_layer(x), y))

image_batch, labels_batch = next(iter(val_ds))
first_image = image_batch[0]

# View normalized data
print(np.min(first_image), np.max(first_image))

0.00390696 1.0
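
The rescaling itself is a single multiplication per pixel. This sketch shows it on plain Python numbers; rescale is a hypothetical helper that only mirrors what Rescaling(1./255) does to each value.

```python
def rescale(pixels, scale=1.0 / 255):
    """Multiply every pixel value by scale, mapping the 0..255 range to 0..1."""
    return [p * scale for p in pixels]

scaled = rescale([0, 128, 255])
print(scaled)
```

A printed minimum of 0.00390696 corresponds to a pixel value of about 0.996 before scaling (0.00390696 × 255). Note also that the snippet above binds the normalized training set to a new name, normalization_train_ds, while val_ds is overwritten in place, so whichever variable is later passed to model.fit determines whether training sees scaled pixels.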

III. Build the VGG-19 network

Choose either the official model or the self-built model and comment out the other. Both are legitimate VGG-19 implementations.

VGG advantages and disadvantages analysis:

VGG advantages

The structure of VGG is very simple, with the same convolution kernel size (3×3) and maximum pooling size (2×2) used throughout the network.

VGG shortcomings

1) Training takes a long time and tuning the hyperparameters is difficult. 2) The model needs a lot of storage, which makes deployment harder. For example, the VGG-16 weight file is larger than 500 MB, which makes it unsuitable for embedded systems.
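
The storage figure can be estimated from the parameter count. As a sketch, assuming each parameter is stored as a 4-byte float32 and ignoring any file-format overhead, the VGG-19 total of 143,667,240 parameters (the number reported by model.summary() later in this article) comes to roughly 548 MiB. weights_size_mib is a hypothetical helper.

```python
def weights_size_mib(num_params, bytes_per_param=4):
    """Approximate size of the raw weights in MiB, assuming float32 parameters."""
    return num_params * bytes_per_param / 2**20

# 143,667,240 is the VGG-19 total reported by model.summary() in this article.
print(round(weights_size_mib(143_667_240)))  # 548
```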

1. Official model (packaged)

I will cover calling the official (pre-packaged) models in the next few articles. Here I focus on VGG-19 itself.

# model = keras.applications.VGG19(weights='imagenet')
# model.summary()

2. Self-built model

from tensorflow.keras import layers, models, Input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout

def VGG19(nb_classes, input_shape):
    input_tensor = Input(shape=input_shape)
    # 1st block
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(input_tensor)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
    # 2nd block
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
    # 3rd block
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv4')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
    # 4th block
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv4')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
    # 5th block
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv4')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
    # Fully connected layers
    x = Flatten()(x)
    x = Dense(4096, activation='relu', name='fc1')(x)
    x = Dense(4096, activation='relu', name='fc2')(x)
    output_tensor = Dense(nb_classes, activation='softmax', name='predictions')(x)

    model = Model(input_tensor, output_tensor)
    return model

model = VGG19(1000, (img_width, img_height, 3))
model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
block3_conv4 (Conv2D)        (None, 56, 56, 256)       590080
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
block4_conv4 (Conv2D)        (None, 28, 28, 512)       2359808
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv4 (Conv2D)        (None, 14, 14, 512)       2359808
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0
flatten (Flatten)            (None, 25088)             0
fc1 (Dense)                  (None, 4096)              102764544
fc2 (Dense)                  (None, 4096)              16781312
predictions (Dense)          (None, 1000)              4097000
=================================================================
Total params: 143,667,240
Trainable params: 143,667,240
Non-trainable params: 0
_________________________________________________________________
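
The parameter counts in the summary can be reproduced by hand. A 3×3 convolution has (3·3·C_in + 1)·C_out parameters, and a dense layer has (inputs + 1)·outputs. The sketch below applies these two formulas to the five blocks; the function names are hypothetical, and num_classes is left as an argument so other output sizes can be tried.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel for a square-kernel Conv2D."""
    return (kernel * kernel * in_channels + 1) * out_channels

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a Dense layer."""
    return (in_units + 1) * out_units

def vgg19_param_count(num_classes, input_size=224, input_channels=3):
    # (output channels, number of conv layers) for the five blocks
    blocks = [(64, 2), (128, 2), (256, 4), (512, 4), (512, 4)]
    total, channels, size = 0, input_channels, input_size
    for out_channels, num_convs in blocks:
        for _ in range(num_convs):
            total += conv_params(3, channels, out_channels)
            channels = out_channels
        size //= 2  # each block ends with a 2x2 max pool
    flat = size * size * channels  # 7 * 7 * 512 = 25088 for a 224x224 input
    total += dense_params(flat, 4096) + dense_params(4096, 4096)
    total += dense_params(4096, num_classes)
    return total

print(vgg19_param_count(1000))  # 143667240
```

Almost all of the total comes from fc1 (102,764,544 parameters). With an output layer of 6 units instead of 1000, the same function gives 139,594,822.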

3. Network structure diagram

For background on how convolutions are computed, see: mtyjkh.blog.csdn.net/article/det…

Structure description:

16 convolutional layers, named blockX_convX
3 fully connected layers, named fcX and predictions
5 pooling layers, named blockX_pool

VGG-19 contains 19 weight layers (16 convolutional layers and 3 fully connected layers), hence the name VGG-19.
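
The naming scheme and the layer count can be checked with a few lines of Python. This sketch only generates the names following the blockX_convX / blockX_pool pattern described above; it does not build a model.

```python
convs_per_block = [2, 2, 4, 4, 4]  # conv layers in blocks 1 to 5

conv_names = [f"block{b}_conv{c}"
              for b, n in enumerate(convs_per_block, start=1)
              for c in range(1, n + 1)]
pool_names = [f"block{b}_pool" for b in range(1, len(convs_per_block) + 1)]
fc_names = ["fc1", "fc2", "predictions"]

print(len(conv_names), len(pool_names), len(fc_names))  # 16 5 3
print(len(conv_names) + len(fc_names))                  # 19
```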

IV. Compile

Before the model is ready for training, it needs a few more settings. These are added in the model's compile step:

Loss function: Used to measure the accuracy of the model during training.
Optimizer: Determines how the model is updated based on the data it sees and its own loss function.
Metrics: Used to monitor training and testing steps. The following example uses accuracy, which is the ratio of images that are correctly classified.

# Set optimizer, I changed the learning rate here.
opt = tf.keras.optimizers.Nadam(learning_rate=1e-5)

model.compile(optimizer=opt,
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
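
To make the loss and the metric concrete, here is a pure-Python sketch of what sparse categorical cross-entropy and accuracy compute for individual samples. The function names mirror the Keras ones only for readability; this is an illustration under simplified assumptions (one sample at a time, no batching), not the Keras implementation. In Keras, from_logits=True tells the loss to expect raw scores and apply softmax itself, whereas from_logits=False is the usual pairing for a model whose last layer already applies softmax.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: negative log-probability of the true class."""
    return -math.log(probs[label])

def accuracy(predicted_labels, true_labels):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)

uniform = softmax([0.0] * 1000)
print(round(sparse_categorical_crossentropy(uniform, 3), 4))  # 6.9078
print(accuracy([0, 1, 2, 2], [0, 1, 2, 3]))                   # 0.75
```

A model that spreads its probability evenly over 1000 outputs has a loss of ln(1000) ≈ 6.91 regardless of the label, which is a useful reference point when reading loss values.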

V. Train the model

epochs = 10

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)

Epoch 1/10
16/16 [==============================] - 12s 276ms/step - loss: 5.4474 - accuracy: 0.1501 - val_loss: 6.8601 - val_accuracy: 0.0714
Epoch 2/10
16/16 [==============================] - 2s 133ms/step - loss: 1.7873 - accuracy: 0.3191 - val_loss: 6.8396 - val_accuracy: 0.4643
Epoch 3/10
16/16 [==============================] - 2s 137ms/step - loss: 1.4631 - accuracy: 0.4250 - val_loss: 6.8453 - val_accuracy: 0.5714
Epoch 4/10
16/16 [==============================] - 2s 130ms/step - loss: 1.1500 - accuracy: 0.6090 - val_loss: 6.8554 - val_accuracy: 0.3571
Epoch 5/10
16/16 [==============================] - 2s 130ms/step - loss: 1.0349 - accuracy: 0.6292 - val_loss: 6.8421 - val_accuracy: 0.4643
Epoch 6/10
16/16 [==============================] - 2s 131ms/step - loss: 1.0131 - accuracy: 0.5919 - val_loss: 6.8288 - val_accuracy: 0.5714
Epoch 7/10
16/16 [==============================] - 2s 131ms/step - loss: 0.6961 - accuracy: 0.7776 - val_loss: 6.8388 - val_accuracy: 0.6429
Epoch 8/10
16/16 [==============================] - 2s 130ms/step - loss: 0.3716 - accuracy: 0.8975 - val_loss: 6.8132 - val_accuracy: 0.5714
Epoch 9/10
16/16 [==============================] - 2s 130ms/step - loss: 0.3372 - accuracy: 0.8586 - val_loss: 6.8059 - val_accuracy: 0.6071
Epoch 10/10
16/16 [==============================] - 2s 130ms/step - loss: ... - accuracy: 0.9736 - val_loss: 6.7767 - val_accuracy: 0.8929

VI. Model evaluation

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.suptitle("Wechat official Account: STUDENT K")

plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
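
Besides plotting the curves, it can be handy to read off the best epoch directly from the recorded metrics. The sketch below does this for a plain list; best_epoch is a hypothetical helper, and the sample values are the val_accuracy numbers transcribed from the training log above.

```python
def best_epoch(values):
    """Return (1-based epoch number, value) of the highest metric value."""
    index = max(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]

# val_accuracy values transcribed from the training log above
val_acc_log = [0.0714, 0.4643, 0.5714, 0.3571, 0.4643,
               0.5714, 0.6429, 0.5714, 0.6071, 0.8929]
print(best_epoch(val_acc_log))  # (10, 0.8929)
```

With the history object from model.fit, the same helper could be called as best_epoch(history.history['val_accuracy']).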

To stay faithful to the original VGG-19, this article does not modify the model parameters. In practice, you can adjust the relevant parameters to suit your own task and improve classification performance.

Compared with the previous article, 100 Examples of Deep Learning (VGG-16) – Convolutional neural network for recognizing the One Piece Straw Hat gang | Day 6, I made three changes:

The model was changed from VGG-16 to VGG-19.
The learning_rate was changed from 1e-4 to 1e-5.
The data set was changed.

Does this start to give you some ideas?

It does not matter if it is not clear yet. I will explain each point later; this is just meant to give you a first taste.

VII. Save and load the model

This is the simplest way to save and load a model.

# Save model
model.save('model/my_model.h5')

# Load model
new_model = keras.models.load_model('model/my_model.h5')

VIII. Prediction

# Use the loaded model (new_model) to view the prediction results

plt.figure(figsize=(10, 5))  # The figure is 10 wide and 5 high
plt.suptitle("Wechat official Account: STUDENT K")

for images, labels in val_ds.take(1):
    for i in range(8):
        ax = plt.subplot(2, 4, i + 1)

        # Display the image
        plt.imshow(images[i])

        # Add a batch dimension to the image
        img_array = tf.expand_dims(images[i], 0)

        # Use the model to predict the character in the picture
        predictions = new_model.predict(img_array)
        plt.title(class_names[np.argmax(predictions)])

        plt.axis("off")
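
The last step of the loop, turning a row of scores into a label, is an argmax followed by a list lookup. Here is a pure-Python sketch of that step. predicted_class is a hypothetical helper, the scores are made-up values for illustration, and the bounds check is an added safeguard for the case where the model has more outputs than there are class names.

```python
def predicted_class(probabilities, class_names):
    """Return the class name with the highest score, or None if the index
    falls outside class_names (e.g. the model has more outputs than classes)."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[index] if index < len(class_names) else None

class_names = ['baiyuekui', 'chaersi', 'hongkou', 'make', 'mogen', 'ranbing']
# Made-up scores for one image, one value per class
print(predicted_class([0.05, 0.10, 0.02, 0.70, 0.08, 0.05], class_names))  # make
```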

This VGG-19 article actually contains quite a few deliberately buried pitfalls. I hid them rather carefully, so I wonder whether you have spotted them. Feel free to discuss your findings in the comments below. For a perfectionist, these imperfections are hard to bear, so I may write a few follow-up articles about them.


To obtain the data set, search WeChat for [K students ah], follow the account, and reply [DL+7].


