I. Preliminary work
My environment:
- Language: Python 3.6.5
- Editor: Jupyter Notebook
- Deep learning framework: TensorFlow 2.4.1
Recommended reading:
- 100 Examples of Deep Learning: Convolutional neural network (VGG-19) to recognize the characters in Ling Cage | Day 7
- 100 Examples of Deep Learning: Convolutional neural network (VGG-16) to recognize the Straw Hat crew in One Piece | Day 6
From the column: 100 Examples of Deep Learning
1. Set the GPU (skip this step if you are using a CPU)
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    gpu0 = gpus[0]  # If there are multiple GPUs, use only the first one (index 0)
    tf.config.experimental.set_memory_growth(gpu0, True)  # Allocate GPU memory on demand
    tf.config.set_visible_devices([gpu0], "GPU")
2. Import data
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()
3. Normalization
Normalize the pixel values to the range 0 to 1.
train_images, test_images = train_images / 255.0, test_images / 255.0

train_images.shape, test_images.shape, train_labels.shape, test_labels.shape
"""
Output: ((60000, 28, 28), (10000, 28, 28), (60000,), (10000,))
"""
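To see what the division by 255 does without downloading MNIST, here is a minimal sketch on a synthetic batch. The array `fake_images` is a made-up stand-in for `train_images`, not data from this article.

```python
import numpy as np

# Hypothetical stand-in for train_images: 4 random 28x28 grayscale images (uint8, 0..255)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing by 255.0 converts the integers to floating point and rescales them to [0, 1]
normalized = fake_images / 255.0

print(fake_images.dtype, normalized.dtype)
print(normalized.min() >= 0.0, normalized.max() <= 1.0)
```

The shape is unchanged; only the value range and the dtype differ.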
4. Visualization
plt.figure(figsize=(20, 10))
for i in range(20):
    plt.subplot(5, 10, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(train_labels[i])
plt.show()
5. Reshape the images
# Reshape the data into the (samples, height, width, channels) format the model expects
train_images = train_images.reshape((60000, 28, 28, 1))
test_images = test_images.reshape((10000, 28, 28, 1))
# The pixel values were already normalized in step 3, so they are not divided by 255 again here

train_images.shape, test_images.shape, train_labels.shape, test_labels.shape
"""
Output: ((60000, 28, 28, 1), (10000, 28, 28, 1), (60000,), (10000,))
"""
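The reshape only appends a channel axis of size 1; no pixel values change. A minimal sketch, assuming a small placeholder array named `images` instead of the real MNIST data:

```python
import numpy as np

# Hypothetical stand-in for the MNIST arrays: 5 grayscale images of 28x28
images = np.zeros((5, 28, 28), dtype=np.float32)

# -1 lets NumPy infer the number of samples, so 60000 / 10000 need not be hard-coded
with_channel = images.reshape((-1, 28, 28, 1))

# np.expand_dims adds the same trailing channel axis
also_with_channel = np.expand_dims(images, axis=-1)

print(with_channel.shape)
print(also_with_channel.shape)
```

Using `-1` is a design choice that keeps the same line valid for the training and test sets.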
II. Build the CNN model
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),  # Convolution layer 1, 3*3 kernel
    layers.MaxPooling2D((2, 2)),                   # Pooling layer 1, 2*2 down-sampling
    layers.Conv2D(64, (3, 3), activation='relu'),  # Convolution layer 2, 3*3 kernel
    layers.MaxPooling2D((2, 2)),                   # Pooling layer 2, 2*2 down-sampling
    layers.Flatten(),                              # Flatten layer, connects the convolution layers to the fully connected layers
    layers.Dense(64, activation='relu'),           # Fully connected layer, further feature extraction
    layers.Dense(10)                               # Output layer, one raw score (logit) per digit
])
# Print the network structure
model.summary()
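The shapes and parameter counts that `model.summary()` reports can be checked by hand. The sketch below is my own arithmetic, assuming the Keras defaults used above (no padding, stride 1 for the convolutions, pooling stride equal to the pool size). The helper names are hypothetical and not part of TensorFlow.

```python
def conv_out(size, kernel):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool):
    """Spatial size after non-overlapping pooling (leftover rows/columns are dropped)."""
    return size // pool

def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

s = conv_out(28, 3)           # 26 -> output (26, 26, 32)
p1 = conv_params(3, 1, 32)    # 320
s = pool_out(s, 2)            # 13 -> output (13, 13, 32)
s = conv_out(s, 3)            # 11 -> output (11, 11, 64)
p2 = conv_params(3, 32, 64)   # 18496
s = pool_out(s, 2)            # 5  -> output (5, 5, 64)
flat = s * s * 64             # 1600 values after Flatten
p3 = dense_params(flat, 64)   # 102464
p4 = dense_params(64, 10)     # 650

total = p1 + p2 + p3 + p4
print(flat, total)            # 1600 121930
```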
III. Compile the model
"""
The optimizer, loss function, and metrics are all set here.
See my blog: https://blog.csdn.net/qq_38251616/category_10258234.html
"""
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
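To make the loss less of a black box, here is a NumPy sketch of what sparse categorical cross-entropy with `from_logits=True` computes: the mean of `-log(softmax(logits)[label])` over the batch. It is an illustration under that assumption, not the TensorFlow implementation, and the function name is my own.

```python
import numpy as np

def sparse_cce_from_logits(logits, labels):
    """Mean of -log(softmax(logits)[label]) over the batch."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    # Subtract the row maximum for numerical stability (does not change the softmax)
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Equal scores for all 10 digits: the loss is log(10), about 2.3026
loss_uniform = sparse_cce_from_logits(np.zeros((2, 10)), [3, 7])

# A confident, correct prediction gives a loss close to 0
confident = np.full((1, 10), -10.0)
confident[0, 3] = 10.0
loss_confident = sparse_cce_from_logits(confident, [3])

print(loss_uniform, loss_confident)
```

The labels are plain integers (0 to 9), which is why the "sparse" variant is used and no one-hot encoding is needed.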
IV. Train the model
"""
Pass in the training dataset (images and labels), the validation dataset (images and labels), and the number of training epochs.
See also: https://blog.csdn.net/qq_38251616/category_10258234.html
"""
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))
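The call above does not pass `batch_size`, so Keras falls back to its default of 32 samples per batch. A quick back-of-the-envelope sketch of how many weight updates that implies (plain arithmetic, no TensorFlow needed):

```python
import math

num_train = 60000   # training images in MNIST
batch_size = 32     # Keras default when batch_size is not passed to model.fit
epochs = 10         # value used in this article

steps_per_epoch = math.ceil(num_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)   # 1875 18750
```

This is the step count shown in the progress bar for each epoch.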
V. Prediction
The network can be understood simply as follows: feed in an image and get back a group of 10 numbers, one score for each of the digits 0 to 9. Because the output layer has no softmax, these are raw scores (logits) rather than probabilities. The larger a score, the more likely the image shows that digit.
plt.imshow(test_images[1])
Output the prediction for the image displayed above (test_images[1], the second image in the test set):
pre = model.predict(test_images)
pre[1]
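Because the output layer is `Dense(10)` with no activation and the loss uses `from_logits=True`, the values in `pre[1]` are raw scores. A minimal sketch of turning such scores into probabilities and a predicted digit follows. The numbers in `example_logits` are invented for illustration and are not actual model output.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Hypothetical scores for one image, one entry per digit 0..9
example_logits = np.array([-3.1, -1.2, 9.8, 0.4, -5.0, -2.2, -4.7, 1.1, -0.3, -2.9])

probs = softmax(example_logits)
predicted_digit = int(np.argmax(example_logits))  # softmax keeps the ordering, so argmax is the same

print(predicted_digit)   # 2
print(probs.sum())
```

With the real model, `np.argmax(pre[1])` gives the predicted digit for the image shown above.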
VI. Detailed explanation of key points
This article uses a simple LeNet-5-style CNN, one of the simplest CNN models. If you are new to deep learning, first get the code running end to end, then work on understanding it.
1. Introduction to the MNIST handwritten digit dataset
The MNIST handwritten digit dataset comes from the National Institute of Standards and Technology (NIST) and is one of the best-known public datasets. The digit images were written by hand by 250 people from different occupations. The dataset can be downloaded from yann.lecun.com/exdb/mnist/ (the files need to be decompressed after downloading). In this article it is loaded directly with:
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()
The dataset contains 70,000 images, 60,000 for training and 10,000 for testing, all 28*28 pixels. The first 20 training images are displayed in the visualization step of Part I.
If we flatten the pixels of each image into a vector, we get a vector of length 28 * 28 = 784. We can therefore view the training set as a tensor of shape [60000, 784], where the first dimension indexes the images and the second dimension indexes the pixels within each image. After normalization, each pixel has a value between 0 and 1.
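The flattened view described above can be sketched in NumPy as follows. The array `images` is a small random placeholder for the normalized training set, so the first dimension is 6 instead of 60000.

```python
import numpy as np

# Hypothetical stand-in for the normalized training set: 6 images of 28x28 with values in [0, 1)
images = np.random.default_rng(0).random((6, 28, 28))

# Keep the image index as the first dimension and unroll the pixels into the second
flat = images.reshape(len(images), -1)
print(flat.shape)   # (6, 784)

# Rows are unrolled one after another: pixel (row r, column c) lands at index r * 28 + c
r, c = 3, 5
same_pixel = flat[0, r * 28 + c] == images[0, r, c]
print(same_pixel)   # True
```

Note that the CNN in this article keeps the 28*28 layout as input; the flattening happens later, inside the Flatten layer.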
2. Neural network program description
The neural network program can be briefly summarized as follows:
3. Network structure description
Structure of the model
The role of each layer
- Input layer: feeds the data into the network
- Convolution layer: uses convolution kernels to extract image features
- Pooling layer: down-samples the feature maps so that image features are represented at a higher level of abstraction
- Flatten layer: flattens a multi-dimensional input into one dimension; commonly used in the transition from the convolution layers to the fully connected layers
- Fully connected layer: plays the role of a "feature extractor"
- Output layer: outputs the results
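To make the roles of the convolution, pooling, and flatten layers concrete, here is a toy NumPy sketch on a single 6*6 single-channel "image". The averaging kernel is a hypothetical choice for illustration; in the real model the kernel values are learned, and each layer has many kernels and channels.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the elementwise products.
    Like Keras Conv2D, this does not flip the kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block (odd leftovers are dropped)."""
    h = feature_map.shape[0] // 2 * 2
    w = feature_map.shape[1] // 2 * 2
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 input
kernel = np.ones((3, 3)) / 9.0                     # hypothetical 3x3 averaging kernel

features = np.maximum(conv2d_valid(image, kernel), 0.0)  # convolution followed by ReLU
pooled = max_pool_2x2(features)                          # down-sampling
flattened = pooled.reshape(-1)                           # what the Flatten layer does

print(features.shape, pooled.shape, flattened.shape)     # (4, 4) (2, 2) (4,)
```

The sizes follow the same pattern as in the model: a 3*3 kernel shrinks each side by 2, and 2*2 pooling halves it.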