

In machine learning, the deciding factor is often not the algorithm but the size of the dataset. – Andrew Ng

Image classification is the task of assigning an input image a label from a fixed set of categories. Although it may seem simple, it is one of the core problems of computer vision and has a wide range of practical applications.


In this article, Small Chip will demonstrate how to apply deep learning when data is scarce. A dataset was built for a car-versus-bus classifier, with 100 images per class: 70 in the training set and 30 in the validation set.

Challenges

1. Viewpoint variation: a single instance of an object can be viewed from many different camera angles.

2. Scale variation: visual classes often exhibit variation in scale (scale here refers to the size of the object in the real world, not only its extent in the image).

3. Deformation: many objects of interest are not rigid bodies (objects whose shape, size, and internal point-to-point distances remain unchanged under motion and applied force) and can deform in extreme ways.

4. Occlusion: the object of interest can be occluded, so sometimes only a small portion of it (as little as a few pixels) is visible.

5. Illumination conditions: the effects of lighting are drastic at the pixel level.


Cat vs dog image classification

Applications

1. Photo galleries and video sites: image classification drives billions of searches on photo sites every day, giving users tools to find visual content through search.

2. Visual search for better product discoverability: with visual search, users can find similar images or products, using a photo they took themselves or an image downloaded from the internet as the reference.

3. Security: this emerging technology plays a big part in the security industry and is already used to develop a variety of security devices, including drones, surveillance cameras, and biometric devices for facial recognition.

4. Healthcare: in healthcare, robot-assisted microsurgery makes use of computer vision and image recognition.

5. Automotive industry: this technology can reduce road traffic accidents and help drivers obey traffic rules and maintain traffic order.


Model performance as a function of data volume

Environment and Tools:


1. Matplotlib

2. Keras


Data


This is a binary classification problem. Small Chip downloaded 200 images, 100 of which are bus images and the rest car images. The data is organized as follows:

dataset/
    train/
        car/
            car1.jpg
            car2.jpg
            //...
        bus/
            bus1.jpg
            bus2.jpg
            //...
    validation/
        car/
            car1.jpg
            car2.jpg
            //...
        bus/
            bus1.jpg
            bus2.jpg
            //...
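This layout can be produced by hand, or with a small helper script like the sketch below. It assumes the 200 downloaded images start out in downloads/car and downloads/bus (both folder names are hypothetical, not from the article):

import os
import random
import shutil

random.seed(0)
for cls in ['car', 'bus']:
    files = sorted(os.listdir(os.path.join('downloads', cls)))
    random.shuffle(files)
    for i, name in enumerate(files):
        # 70 images per class go to train/, the remaining 30 to validation/
        subset = 'train' if i < 70 else 'validation'
        dst = os.path.join('dataset', subset, cls)
        os.makedirs(dst, exist_ok=True)
        shutil.copy(os.path.join('downloads', cls, name), os.path.join(dst, name))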

Sample car images


Sample bus images

Image classification


The formal, complete image classification pipeline is as follows:


· The input is a training set of N images, each labeled with one of two different categories.

· The training set is then used to train a classifier to learn the features of each category.

· Finally, the classifier is asked to predict labels for new images it has never seen before; the true labels of these images are compared with the classifier's predictions to evaluate its performance. A minimal sketch of this prediction step follows.
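To make the last step concrete, here is a hedged sketch of predicting the label of one unseen image, using the classifier built later in this article (the filename new_image.jpg is hypothetical):

from keras.preprocessing import image
import numpy as np

# Load one unseen image at the model's input size and rescale it the same
# way as the training data (pixel values in [0, 1]).
img = image.load_img('new_image.jpg', target_size=(64, 64))
x = np.expand_dims(image.img_to_array(img) / 255., axis=0)

# The sigmoid output is a probability; flow_from_directory assigns class
# indices alphabetically, so here 0 = 'bus' and 1 = 'car'.
print('car' if classifier.predict(x)[0][0] > 0.5 else 'bus')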


First, the code.


Start by loading Keras and its layers, which will be used to build the model.

from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense


Next, build the model, which can be divided into three steps.


1. Use two convolution blocks, each composed of a convolutional layer and a max-pooling layer, with the rectified linear unit (ReLU) as the activation function of the convolutional layers.

2. On top, use a Flatten layer followed by two fully connected layers, with ReLU and the sigmoid function as their respective activations.

3. Use the Adam optimizer and binary cross-entropy as the loss function.

classifier = Sequential()
# Step 1 – Convolution
classifier.add(Conv2D(32, (3, 3), input_shape=(64, 64, 3), activation='relu'))
# Step 2 – Pooling
classifier.add(MaxPooling2D(pool_size=(2, 2)))
# Adding a second convolutional block
classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
# Step 3 – Flattening
classifier.add(Flatten())
# Step 4 – Full connection
classifier.add(Dense(128, activation='relu'))
classifier.add(Dense(1, activation='sigmoid'))
# Compiling the CNN
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

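Before training, it is worth sanity-checking the architecture. Keras can print a layer-by-layer summary (the exact output shapes and parameter counts may vary slightly across Keras versions):

classifier.summary()   # prints each layer's output shape and parameter count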

Data augmentation


Data augmentation is an effective way to enlarge the training set. Augmenting the training samples allows the network to see more diverse, but still representative, data points during training.


The following code defines a set of augmentations for the training set: rotation, shifting, shearing, flipping, and zooming.

If the dataset is too small, data augmentation should be used to create additional training data.


At the same time, Small Chip created a data generator to automatically fetch images from their folders and feed them into Keras. Keras provides convenient Python generator functions for this purpose.


from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   rotation_range=15,
                                   vertical_flip=True,
                                   fill_mode='reflect',
                                   data_format='channels_last',
                                   brightness_range=[0.5, 1.5],
                                   featurewise_center=True,
                                   featurewise_std_normalization=True)
# Note: featurewise_center and featurewise_std_normalization only take effect
# after calling train_datagen.fit() on a sample of the training data; without
# it, Keras emits a warning and skips them.

test_datagen = ImageDataGenerator(rescale=1./255)

training_set = train_datagen.flow_from_directory('dataset/train',
                                                 target_size=(64, 64),
                                                 batch_size=32,
                                                 class_mode='binary')

test_set = test_datagen.flow_from_directory('dataset/validation',
                                            target_size=(64, 64),
                                            batch_size=32,
                                            class_mode='binary')

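To see what these augmentations actually produce, one can preview a few samples from an augmented batch (a quick sketch using the training_set generator defined above and matplotlib):

import matplotlib.pyplot as plt

images, labels = next(training_set)           # one augmented batch of 32 images
fig, axes = plt.subplots(1, 4, figsize=(12, 3))
for ax, img in zip(axes, images[:4]):         # show the first four samples
    ax.imshow(img)
    ax.axis('off')
plt.show()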

Next, the model is trained for 50 epochs with a batch size of 32.


Batch size is one of the most important hyperparameters in deep learning. Small Chip is used to training with a larger batch size, because this allows the GPU to parallelize computation and speeds up training.


However, it is well known that a batch size that is too large leads to poor generalization.


On the one hand, using a batch equal to the size of the whole dataset guarantees convergence to the global optimum of the objective function. However, convergence toward that optimum is slower.


On the other hand, smaller batch sizes have been shown to converge to good results more quickly. The intuition is that a small batch size allows the model to start learning before it has seen all of the data.


However, the disadvantage is that convergence to the global optimum is not guaranteed. Therefore, it is generally recommended to start with a small batch size, benefiting from its faster training dynamics, and then increase the batch size gradually as training proceeds; a sketch of such a schedule is given below.
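A hedged sketch of such a schedule follows. It is not what this article's run uses (that run keeps a fixed batch size of 32), and the stage sizes and epoch counts here are purely illustrative; it reuses the train_datagen and classifier defined above:

for stage_batch in [16, 32, 64]:              # gradually increase the batch size
    staged_set = train_datagen.flow_from_directory('dataset/train',
                                                   target_size=(64, 64),
                                                   batch_size=stage_batch,
                                                   class_mode='binary')
    # keep roughly 128 samples per epoch at every stage
    classifier.fit_generator(staged_set, steps_per_epoch=128 // stage_batch, epochs=5)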


history = classifier.fit_generator(training_set,
                                   steps_per_epoch=4,    # 4 steps of batch size 32, i.e. ~128 samples per epoch
                                   epochs=50,
                                   validation_data=test_set,
                                   validation_steps=2)   # 2 steps cover the ~59 validation samples


Visualize the results by plotting the validation loss and accuracy.


import matplotlib.pyplot as plt

fig = plt.figure()
plt.plot(history.history['val_loss'])
plt.legend(['validation'], loc='upper left')
plt.title('validation loss vs epoch')
plt.ylabel('validation loss')
plt.xlabel('Epoch')
plt.show()


Validation loss vs. epoch

import matplotlib.pyplot as plt

fig = plt.figure()
plt.plot(history.history['val_acc'])   # use 'val_accuracy' on newer Keras versions
plt.legend(['validation'], loc='upper left')
plt.title('validation accuracy vs epoch')
plt.ylabel('validation accuracy')
plt.xlabel('Epoch')
plt.show()



The model can achieve 100% validation accuracy after running 50 epochs.
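To confirm this number programmatically rather than reading it off the plot, the validation generator can be evaluated directly (a minimal sketch; the two steps of batch size 32 are an assumption sized to cover the validation images used here):

loss, acc = classifier.evaluate_generator(test_set, steps=2)
print('validation loss: {:.4f}, validation accuracy: {:.2%}'.format(loss, acc))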

Conclusion


Deep learning, then, can be applied even when data is scarce.


This model achieves 100% validation accuracy within 50 epochs using only 100 images per category.


The model can also be extended to other binary or multi-class image classification problems.
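For the multi-class case, only the output layer, the loss, and the generator's class mode need to change. A sketch, with a hypothetical n_classes:

n_classes = 5   # hypothetical number of categories

# When building the model, end it with a softmax over all classes
# instead of the single sigmoid unit:
classifier.add(Dense(n_classes, activation='softmax'))
classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# And have the generators produce one-hot labels:
training_set = train_datagen.flow_from_directory('dataset/train',
                                                 target_size=(64, 64),
                                                 batch_size=32,
                                                 class_mode='categorical')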


One might object that this task is fairly simple, since even the naked eye can easily tell a car from a bus. So could we use the same approach to develop a classifier that distinguishes benign from malignant tumors?


The answer is yes.


We can develop such a classifier, but the key is to use data augmentation on any small dataset. Another solution is to use pre-trained weights for transfer learning.
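As a sketch of the transfer-learning route (assuming Keras's built-in VGG16 ImageNet weights; the head sizes are illustrative, not the article's method):

from keras.applications import VGG16
from keras.models import Sequential
from keras.layers import Flatten, Dense

# Pre-trained convolutional base, frozen so that only the new head is trained.
base = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))
base.trainable = False

model = Sequential()
model.add(base)
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])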


Now, do you understand?


We share practical content on AI learning and development. You are welcome to follow "Core Reading Technology", our AI-focused media account, on all platforms.



(Add WeChat: DXSXBB to join the readers' circle and discuss the latest artificial intelligence technology.)