Public account: You and the cabin by: Peter Editor: Peter
Hello, I’m Peter
This article records a first application of convolutional neural networks to image recognition: classifying images of cats and dogs. The main contents are:
- Data processing
- Building the neural network model
- Implementing data augmentation
The deep learning framework used in this article is Keras.
The image data comes from the official Kaggle website: www.kaggle.com/c/dogs-vs-c…
Data processing
The amount of data
The dataset contains 25,000 images, 12,500 each of cats and dogs. From these we create, for each category, a training set of 1000 samples, a validation set of 500 samples, and a test set of 500 samples.
Note: only part of the data is used for modeling.
Create a directory
In [1]:
import os, shutil
In [2]:
current_dir = !pwd  # current directory (IPython shell capture)
current_dir[0]
Out[2]:
'/Users/peter/Desktop/kaggle/kaggle_12_dogs&cats/dogs-vs-cats'
Create a new directory to store the required data set:
base_dir = current_dir[0] + '/cats_dogs_small'
os.mkdir(base_dir)  # create the directory
Create directories for training set, validation set, and test set respectively
train_dir = os.path.join(base_dir,"train")
os.mkdir(train_dir)
validation_dir = os.path.join(base_dir,"validation")
os.mkdir(validation_dir)
test_dir = os.path.join(base_dir,"test")
os.mkdir(test_dir)
# Cat and dog image directories for training, validation, and testing
train_cats_dir = os.path.join(train_dir, "cats")
os.mkdir(train_cats_dir)
train_dogs_dir = os.path.join(train_dir, "dogs")
os.mkdir(train_dogs_dir)
validation_cats_dir = os.path.join(validation_dir, "cats")
os.mkdir(validation_cats_dir)
validation_dogs_dir = os.path.join(validation_dir, "dogs")
os.mkdir(validation_dogs_dir)
test_cats_dir = os.path.join(test_dir, "cats")
os.mkdir(test_cats_dir)
test_dogs_dir = os.path.join(test_dir, "dogs")
os.mkdir(test_dogs_dir)
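The same directory layout can also be built with a loop. The sketch below is an alternative, not the code used above: the helper name `make_split_dirs` is hypothetical, and a temporary directory stands in for `cats_dogs_small` so that the example runs on its own.

```python
import os
import tempfile


def make_split_dirs(base_dir):
    """Create <base_dir>/<split>/<label> for every split and label; return the paths."""
    paths = {}
    for split in ("train", "validation", "test"):
        for label in ("cats", "dogs"):
            path = os.path.join(base_dir, split, label)
            # exist_ok=True lets the cell be re-run without a FileExistsError
            os.makedirs(path, exist_ok=True)
            paths[(split, label)] = path
    return paths


# Assumption: a throwaway directory stands in for cats_dogs_small
demo_base = tempfile.mkdtemp()
split_dirs = make_split_dirs(demo_base)
print(sorted(os.listdir(demo_base)))
```

Unlike os.mkdir, os.makedirs with exist_ok=True does not fail when the directory already exists, which is convenient when a notebook cell is executed more than once.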
Copying the data set
In [5]:
# 1. 1000 cat images as the training set
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(train_cats_dir, fname)
    shutil.copyfile(src, dst)
In [6]:
# 2. 500 cat images as the validation set
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(validation_cats_dir, fname)
    shutil.copyfile(src, dst)
In [7]:
# 3. 500 cat images as the test set
fnames = ['cat.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(test_cats_dir, fname)
    shutil.copyfile(src, dst)
In [8]:
# Do the same for the dog images
# 1. 1000 images as the training set
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(train_dogs_dir, fname)
    shutil.copyfile(src, dst)
# 2. 500 images as the validation set
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(validation_dogs_dir, fname)
    shutil.copyfile(src, dst)
# 3. 500 images as the test set
fnames = ['dog.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(test_dogs_dir, fname)
    shutil.copyfile(src, dst)
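The copy loops differ only in the animal name, the index range, and the destination directory, so they could be folded into one helper. This is a sketch under assumptions: the function name `copy_range` is hypothetical, and the demo uses empty placeholder files in temporary directories instead of the real Kaggle images.

```python
import os
import shutil
import tempfile


def copy_range(src_dir, dst_dir, animal, start, stop):
    """Copy <animal>.<i>.jpg for i in [start, stop) from src_dir to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    for i in range(start, stop):
        fname = "{}.{}.jpg".format(animal, i)
        shutil.copyfile(os.path.join(src_dir, fname),
                        os.path.join(dst_dir, fname))
    return len(os.listdir(dst_dir))


# Demo with placeholder files (assumption: stand-ins for the Kaggle train folder)
demo_src = tempfile.mkdtemp()
demo_dst = os.path.join(tempfile.mkdtemp(), "train", "cats")
for i in range(10):
    open(os.path.join(demo_src, "cat.{}.jpg".format(i)), "w").close()

n_copied = copy_range(demo_src, demo_dst, "cat", 0, 5)
print(n_copied)
```

With the real data, a call such as copy_range(current_dir[0] + "/train", train_cats_dir, "cat", 0, 1000) would play the role of the first loop.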
Check the data
Check how many images each set (training, validation, test) contains for the cat and dog categories:
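A minimal sketch of such a check, assuming the directory layout created above. The helper name `count_images` is hypothetical, and the demo runs on a temporary directory with a few placeholder files rather than on the real images.

```python
import os
import tempfile


def count_images(base_dir):
    """Return {(split, label): number of files} for the train/validation/test layout."""
    counts = {}
    for split in ("train", "validation", "test"):
        for label in ("cats", "dogs"):
            folder = os.path.join(base_dir, split, label)
            counts[(split, label)] = len(os.listdir(folder))
    return counts


# Demo layout (assumption: a temporary directory, not the real data)
demo_base = tempfile.mkdtemp()
for split in ("train", "validation", "test"):
    for label in ("cats", "dogs"):
        os.makedirs(os.path.join(demo_base, split, label))
# Put three placeholder files in train/cats
for i in range(3):
    open(os.path.join(demo_base, "train", "cats", "cat.{}.jpg".format(i)), "w").close()

counts = count_images(demo_base)
for key, n in counts.items():
    print(key, n)
```

Run on the real base_dir after the copy steps, this should report 1000 images in each training folder and 500 in each validation and test folder.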
Building a neural network
To review the structure of a convolutional neural network: Conv2D layers (with ReLU activation) and MaxPooling2D layers are stacked alternately.
For larger images and more complex problems, one more Conv2D (ReLU) + MaxPooling2D stage is added.
The benefits of this are:
- Increased network capacity
- Further reduction of the feature map size
Note: cat-versus-dog classification is a binary classification problem, so the last layer of the network is a single unit (a Dense layer of size 1) with sigmoid activation.
In the network, the depth of the feature maps increases gradually (from 32 to 128), while their size decreases gradually (from 150×150 to 7×7).
- Increasing depth: the image content is complex, so more filters are needed
- Decreasing size: successive convolution and pooling layers keep compressing and abstracting the image
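The shrinking from 150×150 to 7×7 can be checked by hand. The sketch below assumes the Keras defaults (a 3×3 convolution with "valid" padding and stride 1 removes 2 pixels per side length, and 2×2 max pooling halves the size, rounding down) and four Conv2D + MaxPooling2D stages; the helper names are hypothetical.

```python
def conv_output(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Spatial size after max pooling with a pool x pool window and matching stride."""
    return size // pool


size = 150
sizes = [size]
for _ in range(4):  # four Conv2D + MaxPooling2D stages
    size = conv_output(size)
    sizes.append(size)
    size = pool_output(size)
    sizes.append(size)

print(sizes)      # [150, 148, 74, 72, 36, 34, 17, 15, 7]
flattened = size * size * 128  # 128 channels come out of the last stage
print(flattened)  # 6272
```

The last number is the length of the vector that a Flatten layer would hand to the dense classifier.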
The network structure:
In [15]:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import models

model = models.Sequential()
model.add(tf.keras.layers.Conv2D(32, (3, 3), activation="relu",
                                 input_shape=(150, 150, 3)))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(128, (3, 3), activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(128, (3, 3), activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, activation="relu"))
model.add(tf.keras.layers.Dense(1, activation="sigmoid"))
model.summary()
Model compilation (optimization)
The last layer of the network is a single sigmoid unit, so binary cross-entropy is used as the loss function.
In [16]:
# Originally: from keras import optimizers
from tensorflow.keras import optimizers
model.compile(loss="binary_crossentropy",
              optimizer=optimizers.RMSprop(learning_rate=1e-4),  # `lr` is the older name of this argument
              metrics=["acc"])
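To make the loss concrete: for a label y in {0, 1} and a predicted probability p, binary cross-entropy is -(y·log p + (1 - y)·log(1 - p)), averaged over the batch. The NumPy sketch below illustrates that formula only; it is not Keras's internal implementation, and the labels and probabilities are made-up numbers.

```python
import numpy as np


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    losses = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(losses.mean())


labels = [1, 0, 1, 0]             # hypothetical binary labels
confident = [0.9, 0.1, 0.8, 0.2]  # hypothetical predictions close to the labels
unsure = [0.5, 0.5, 0.5, 0.5]     # hypothetical predictions that always say 50%

loss_confident = binary_crossentropy(labels, confident)
loss_unsure = binary_crossentropy(labels, unsure)
print(loss_confident, loss_unsure)
```

Predictions that always say 50% give a loss of ln 2 ≈ 0.693, which is a useful reference point: a training loss near that value means the model has not learned anything yet.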
Data preprocessing
Data must be converted into floating point tensors before being fed into the neural network.
Keras has an image-processing module: keras.preprocessing.image.
It includes the ImageDataGenerator class, which lets you quickly create Python generators that turn image files into batches of tensors.
How to understand generators in Python?
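A generator is a function that uses `yield` to hand back one value at a time and resumes where it left off when the next value is requested. The toy example below (the name `batch_generator` is hypothetical and unrelated to Keras) loops over a list forever, much as a Keras image generator loops over a folder:

```python
def batch_generator(items, batch_size):
    """Yield consecutive batches from items, starting over when the end is reached."""
    while True:  # endless, so the caller decides when to stop
        for start in range(0, len(items), batch_size):
            yield items[start:start + batch_size]


gen = batch_generator([1, 2, 3, 4, 5], batch_size=2)
first_four = [next(gen) for _ in range(4)]
print(first_four)  # [[1, 2], [3, 4], [5], [1, 2]]
```

Because the loop never ends on its own, whoever consumes the generator has to say how many batches to draw.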
Data preprocessing steps:
- Read the image files
- Decode the JPEG files into RGB grids of pixels
- Convert the pixel grids into floating-point tensors
- Rescale the pixel values from [0, 255] to [0, 1] (the rescale=1./255 argument in the code)
In [18]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255)  # rescale pixel values to [0, 1]
test_datagen = ImageDataGenerator(rescale=1./255)   # rescale pixel values to [0, 1]
train_generator = train_datagen.flow_from_directory(
    train_dir,               # directory to be processed
    target_size=(150, 150),  # resize all images to 150×150
    batch_size=20,
    class_mode="binary"      # the loss is binary_crossentropy, so use binary labels
)
validation_generator = test_datagen.flow_from_directory(
    validation_dir,          # directory to be processed
    target_size=(150, 150),  # resize all images to 150×150
    batch_size=20,
    class_mode="binary"      # the loss is binary_crossentropy, so use binary labels
)
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
In [19]:
for data_batch, labels_batch in train_generator:
    print(data_batch.shape)
    print(labels_batch.shape)
    break
(20, 150, 150, 3)
(20,)
The generator yields batches of 150×150 RGB images with shape (20, 150, 150, 3) and binary labels with shape (20,). Each batch contains 20 samples (the batch size).
The generator produces these batches endlessly, looping over the images in the target folder.
A Keras model is fitted on a generator with the fit_generator method. Because the generator never stops, the steps_per_epoch parameter tells the model how many batches make up one epoch: after steps_per_epoch batches have been drawn from the generator, training moves on to the next epoch.
In this example there are 2000 samples in total and each batch holds 20 samples, so 100 batches are needed per epoch.
Model fitting
In [20]:
history = model.fit_generator(
    train_generator,       # the first argument must be a Python generator
    steps_per_epoch=100,   # 2000 / 20
    epochs=30,             # number of epochs
    validation_data=validation_generator,  # data set used for validation
    validation_steps=50    # 1000 / 20
)
Save the model
In [21]:
model.save("cats_and_dogs_small.h5")
Loss and accuracy curves
In [22]:
import matplotlib.pyplot as plt
%matplotlib inline
In [23]:
history_dict = history.history  # dictionary form
for key, _ in history_dict.items():
    print(key)
loss
acc
val_loss
val_acc
In [24]:
acc = history_dict["acc"]
val_acc = history_dict["val_acc"]
loss = history_dict["loss"]
val_loss = history_dict["val_loss"]
In [25]:
epochs = range(1, len(acc) + 1)
# acc
plt.plot(epochs, acc, "bo", label="Training acc")
plt.plot(epochs, val_acc, "b", label="Validation acc")
plt.title("Training and Validation acc")
plt.legend()
plt.figure()
# loss
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and Validation loss")
plt.legend()
Summary: the curves show overfitting
- As training proceeds, the training accuracy keeps increasing and approaches 100%, while the validation accuracy stalls at about 70%
- The validation loss reaches its minimum after about 6 epochs and then stays flat for a number of epochs, while the training loss keeps decreasing and approaches zero
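The epoch with the lowest validation loss can also be read off programmatically instead of from the plot. The values below are hypothetical and only shaped like an overfitting run; they are not the numbers from the training above.

```python
import numpy as np

# Hypothetical validation-loss values, for illustration only
val_loss_demo = [0.69, 0.65, 0.61, 0.58, 0.57, 0.55, 0.56, 0.58, 0.61, 0.66]

best_index = int(np.argmin(val_loss_demo))
best_epoch = best_index + 1  # epochs are numbered from 1 in the plots
print(best_epoch, val_loss_demo[best_index])  # 6 0.55
```

With a real run, history_dict["val_loss"] would take the place of val_loss_demo.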
Data augmentation
What is data augmentation
Data augmentation is another way to fight overfitting. Two other methods are:
- Dropout
- Weight decay regularization
What data augmentation does: it generates more training data from the existing training samples by applying a variety of random transformations that still produce believable images.
As a result, the model never sees exactly the same image twice during training.
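To illustrate the idea without Keras, the sketch below applies two simple random transformations, a horizontal flip and a horizontal shift with wrap-around, to a small array standing in for an image. It is a simplified illustration rather than what ImageDataGenerator does internally, and the function name `random_augment` is hypothetical.

```python
import numpy as np


def random_augment(img, rng, max_shift=2):
    """Return a randomly flipped and horizontally shifted copy of img (H, W, C)."""
    out = img.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1, :]           # horizontal flip
    shift = int(rng.integers(-max_shift, max_shift + 1))
    out = np.roll(out, shift, axis=1)   # shift columns, wrapping around the edge
    return out


rng = np.random.default_rng(0)
toy_image = np.arange(4 * 6 * 3, dtype=np.float32).reshape(4, 6, 3)  # toy "image"
variants = [random_augment(toy_image, rng) for _ in range(5)]
print([v.shape for v in variants])
```

Every variant has the same shape and the same pixel values as the original, only rearranged, so each one is a new but still plausible training sample.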
Setting up data augmentation
In [26]:
datagen = ImageDataGenerator(
    rotation_range=40,        # range of random rotations, in degrees (0-180)
    width_shift_range=0.2,    # horizontal shift, as a fraction of the total width
    height_shift_range=0.2,   # vertical shift, as a fraction of the total height
    shear_range=0.2,          # intensity of random shear transformations
    zoom_range=0.2,           # range of random zoom
    horizontal_flip=True,     # randomly flip half of the images horizontally
    fill_mode="nearest"       # how newly created pixels are filled
)
Display the augmented images
In [27]:
from tensorflow.keras.preprocessing import image
fnames = [os.path.join(train_cats_dir,fname) for fname in os.listdir(train_cats_dir)]
img_path = fnames[3]
In [28]:
# Read the image and resize it
img = image.load_img(img_path, target_size=(150, 150))
# Convert to an array of shape (150, 150, 3)
x = image.img_to_array(img)
# Reshape to (1, 150, 150, 3)
x = x.reshape((1,) + x.shape)
i = 0
for batch in datagen.flow(x, batch_size=1):  # generate batches of randomly transformed images
    plt.figure()
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break  # the generator loops forever, so stop after four images
plt.show()
A new convolutional neural network with a Dropout layer
With data augmentation, the network never sees the same input twice during training. But the inputs are still highly correlated, because they all come from a small number of original images, so overfitting cannot be eliminated completely.
Consider adding a Dropout layer just before the densely connected classifier.
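As a reminder of what dropout does at training time, here is a NumPy sketch of "inverted" dropout: a random fraction `rate` of the values is set to zero and the survivors are scaled up by 1/(1 - rate) so that the expected sum stays the same. The function name `dropout_train` is hypothetical; this is an illustration, not the Keras layer itself.

```python
import numpy as np


def dropout_train(x, rate, rng):
    """Inverted dropout: zero a random fraction `rate` of entries and rescale the rest."""
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob   # True where a unit is kept
    return x * mask / keep_prob


rng = np.random.default_rng(0)
features = np.ones((2, 8))
dropped = dropout_train(features, rate=0.5, rng=rng)
print(dropped)
```

With rate=0.5 every entry of the all-ones input comes out as either 0 or 2. At inference time no units are dropped and no scaling is applied.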
In [29]:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers

model = models.Sequential()
model.add(tf.keras.layers.Conv2D(32, (3, 3), activation="relu",
                                 input_shape=(150, 150, 3)))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(128, (3, 3), activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(128, (3, 3), activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Flatten())
# New: Dropout layer before the dense classifier
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(512, activation="relu"))
model.add(tf.keras.layers.Dense(1, activation="sigmoid"))
model.compile(loss="binary_crossentropy",
              optimizer=optimizers.RMSprop(learning_rate=1e-4),  # `lr` is the older name of this argument
              metrics=["acc"])
Training the convolutional neural network with data augmentation (error resolution)
About the error resolution: we have 2000 training images, 1000 validation images, and 1000 test images.
- With steps_per_epoch=100 and batch_size=32, one epoch would need 3200 images, so there is clearly not enough training data.
- With validation_steps=50 and batch_size=32, validation would need 1600 images, so there is clearly not enough validation data.
Therefore, rounding up, steps_per_epoch = 2000/32 ≈ 63 and validation_steps = 1000/32 ≈ 32.
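The rounding can be written out explicitly. This small sketch uses math.ceil so that the last, partially filled batch is still counted; the variable names are chosen for illustration.

```python
import math

train_samples, val_samples = 2000, 1000
batch_size = 32

# Round up so that the final, smaller batch is included
steps_per_epoch = math.ceil(train_samples / batch_size)
validation_steps = math.ceil(val_samples / batch_size)
print(steps_per_epoch, validation_steps)  # 63 32
```

With the earlier batch size of 20 the same calculation gives exactly 100 and 50, which is why those values worked before the batch size was changed.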
In [44]:
# Augment the training data
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True
)
# Validation data must not be augmented
test_datagen = ImageDataGenerator(rescale=1.0 / 255)
train_generator = train_datagen.flow_from_directory(
    train_dir,               # target directory
    target_size=(150, 150),  # resize all images to 150×150
    batch_size=32,
    class_mode="binary"
)
validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode="binary"
)
# Fit the model; the step counts were changed to resolve the error
history = model.fit_generator(
    train_generator,
    # original: steps_per_epoch=100,
    steps_per_epoch=63,    # 2000 / 32 ≈ 63
    epochs=100,
    validation_data=validation_generator,
    # original: validation_steps=50
    validation_steps=32    # 1000 / 32 ≈ 32
)
Saving the model:
# Save the model
model.save("cats_and_dogs_small_2.h5")
Loss and accuracy curves
In [46]:
history_dict = history.history # dictionary form
acc = history_dict["acc"]
val_acc = history_dict["val_acc"]
loss = history_dict["loss"]
val_loss = history_dict["val_loss"]
The plotting code:
epochs = range(1, len(acc) + 1)
# acc
plt.plot(epochs, acc, "bo", label="Training acc")
plt.plot(epochs, val_acc, "b", label="Validation acc")
plt.title("Training and Validation acc")
plt.legend()
plt.figure()
# loss
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and Validation loss")
plt.legend()
plt.show()
Conclusion: with data augmentation, the model no longer overfits, and the training curves closely follow the validation curves. The accuracy is 81%, an improvement over the roughly 70% reached without augmentation.