

Following on from the previous article, this time an entry-level CNN (convolutional neural network) is used to complete the price recognition. (To mirror the previous ones, one last bit of clickbait 🥺)

1 Analysis

The original image has already been obtained; next it is processed and cut into pieces to serve as raw material for machine learning.

Since the image is in PNG format, it usually has 4 channels (RGB + transparency).

General processing flow:

1. Get the original image: 4 channels (RGB + transparency).

2. Convert to a grayscale image: single channel, pixel values 0-255.
   Grayscale conversion formula: L = R * 299/1000 + G * 587/1000 + B * 114/1000

3. Binarize the grayscale image: each pixel value is converted to 0 or 1, e.g. 0 if _ < 200 else 1.

For more complex data, preprocessing may also involve border removal, edge detection, tilt correction, cutting, noise reduction (erosion, dilation), etc.

The data here is relatively simple; once converted to binary it can be used directly.
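
A minimal sketch of the flow above using PIL (the file name and the threshold of 200 are illustrative):

from PIL import Image

# 1. original image: 4-channel PNG (RGBA)
img = Image.open('price.png')
# 2. grayscale: L = R * 299/1000 + G * 587/1000 + B * 114/1000
img_gray = img.convert('L')
# 3. binarize: map each 0-255 gray value to 0 or 1 via a 256-entry lookup table
img_bin = img_gray.point([0 if i < 200 else 1 for i in range(256)], '1')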

2 Recognition

2.1 Cutting the images

Key cutting code:

import copy
import os

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image, ImageDraw

lines = [-281.16, -249.92, -218.68, -187.44, -156.2, -124.96, -93.72, -62.48, -31.24, -0.0]
lines_step = 22
lines_map = {
    '281.16': 336, '249.92': 299, '218.68': 261, '187.44': 223, '156.2': 187,
    '124.96': 149, '93.72': 112, '62.48': 74, '31.24': 38, '0.0': 1,
}
idx = 1


def process_img(imgpath: str):
    global idx
    # Original image
    img = Image.open(imgpath)
    width, height = img.size
    img2 = copy.deepcopy(img)
    img_arr = np.array(img)
    print(img_arr.shape)
    # convert to grayscale
    # L = R * 299/1000 + G * 587/1000 + B * 114/1000
    img_gray = img.convert('L')
    img_gray_arr = np.array(img_gray)
    print(img_gray_arr.shape)
    for data in img_gray_arr:
        pass
        # print(''.join(['{:03}'.format(_) for _ in data]))
        # print(''.join(['{:03}'.format(_) if _ != 0 else '...' for _ in data]))
    # binarization
    img_bin = img_gray.point([0 if _ < 128 else 1 for _ in range(256)], '1')
    img_bin_arr = np.array(img_bin)
    print(img_bin_arr.shape)
    for data in img_bin_arr:
        pass
        # print(''.join(['1' if _ else '0' for _ in data]))
        # print(''.join(['X' if _ else '.' for _ in data]))
    # Image processing
    img_draw = ImageDraw.Draw(img2)
    for line in lines:
        # the keys of lines_map are the absolute values of the offsets
        new_line = lines_map.get(str(abs(line)))
        p1 = (new_line, 1)
        p2 = (new_line + 22, height - 1)
        # draw a box around the digit
        img_draw.rectangle((p1, p2), outline='red')
        # crop the digit out of the binarized image
        img_crop = img_bin.crop((new_line, 0, new_line + 22, height))
        img_crop.save(os.path.join('imgs_crop', '{:03}.png'.format(idx)))
        idx += 1
    plt.imshow(img2)
    plt.show()
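
A hypothetical driver for the function above, assuming the raw captcha images sit in an imgs_raw folder (both folder names are assumptions):

if __name__ == '__main__':
    os.makedirs('imgs_crop', exist_ok=True)
    for name in sorted(os.listdir('imgs_raw')):
        if name.endswith('.png'):
            process_img(os.path.join('imgs_raw', name))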

The images after cutting:

The images are then manually sorted into folders named by digit, completing the manual annotation.

2.2 Recognition training

This is done mainly with Python 3 and Keras + TensorFlow.

Example model code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense


def gen_model():
    """
    Build the model
    :return: model
    """
    _model = Sequential([
        # convolution layer
        # 36 is the output dimension, i.e. the number of convolution kernels
        # kernel_size is the size of each convolution kernel
        Conv2D(36, kernel_size=3, padding='same', activation='relu', input_shape=(36, 22, 1)),
        # max pooling layer
        MaxPooling2D(pool_size=(2, 2)),
        # Dropout randomly sets a fraction of the input units to 0 at each update
        # during training, which helps prevent overfitting
        Dropout(0.25),
        # convolution layer (input_shape is only needed on the first layer)
        Conv2D(64, kernel_size=3, padding='same', activation='relu'),
        # max pooling layer
        MaxPooling2D(pool_size=(2, 2)),
        Dropout(0.25),
        # Flatten turns the multidimensional feature maps into one-dimensional data
        Flatten(),
        # fully connected layers
        Dense(512, activation='relu'),
        Dropout(0.5),
        Dense(10, activation='softmax'),
    ])
    return _model
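
For reference, a sketch of the feature-map shapes flowing through this model (batch dimension omitted; Keras floors odd sizes when pooling):

# input      (36, 22, 1)
# Conv2D     (36, 22, 36)   padding='same' preserves width/height
# MaxPool    (18, 11, 36)
# Conv2D     (18, 11, 64)
# MaxPool    (9, 5, 64)     11 // 2 == 5
# Flatten    2880
# Dense      512
# Dense      10             softmax over the digits 0-9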

Example training code:

def train():
    model = gen_model()
    model.summary()
    # model compilation
    # optimizer: the optimizer to use
    # loss: name of the loss (objective) function
    # metrics: metrics used to evaluate the model's performance during training and testing
    model.compile(optimizer='adam',  # or e.g. keras.optimizers.Adadelta()
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy']
                  )

    x_train, y_train = load_data()
    x_train = x_train.reshape(-1, 36, 22, 1)
    x_test, y_test = load_test_data()
    x_test = x_test.reshape(-1, 36, 22, 1)

    # run the training
    # x_train: input data
    # y_train: labels
    # batch_size: number of samples per gradient update; each batch of samples is used
    #   for one gradient-descent step, moving the objective function one step forward
    # epochs: integer, number of training rounds; each epoch runs through the whole training set once
    # verbose: 0 = no log output to stdout, 1 = progress bar, 2 = one line per epoch
    # validation_data: the validation dataset
    # callbacks could also be passed here, e.g. a TensorBoard monitor
    history = model.fit(x_train, y_train, batch_size=32, epochs=20, verbose=1,
                        validation_data=(x_test, y_test))

    # evaluate the trained model on the test set
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])

    # plot the training/test accuracy over the epochs
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title('Model accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Test'], loc='upper left')
    plt.show()

    # plot the training/test loss over the epochs
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('Model loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Test'], loc='upper left')
    plt.show()

    model.save('model/ziru.h5')
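
The callbacks comment above refers to a TensorBoard monitor; a minimal sketch of wiring one in (the log directory name is an assumption):

from tensorflow.keras.callbacks import TensorBoard

tb = TensorBoard(log_dir='logs')
history = model.fit(x_train, y_train, batch_size=32, epochs=20, verbose=1,
                    validation_data=(x_test, y_test), callbacks=[tb])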

Sample code for training data generation:

Main points: train_label and train_data. The label is the value a sample should be recognized as; the data is the actual pixel data of that sample.

def gen_train_data(parent_path: str):
    train_data = []
    train_label = []
    for idx in range(10):
        # each digit's samples live in a folder named after the digit, e.g. parent_path/3/
        cur_path = os.path.join(parent_path, str(idx))
        for dirpath, dirnames, filenames in os.walk(cur_path):
            for filename in filenames:
                if filename.endswith('png'):
                    imgpath = os.path.join(cur_path, filename)
                    # the folder name (second path component) is the label
                    label = imgpath.split('/')[1]
                    data = np.array(Image.open(imgpath))
                    train_label.append(int(label))
                    train_data.append(data)
    return np.array(train_data), np.array(train_label)
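
train() above also calls load_data() and load_test_data(), which are not shown; a minimal sketch, assuming the manually annotated crops are kept in imgs_train and imgs_test folders (both folder names are assumptions):

def load_data():
    return gen_train_data('imgs_train')


def load_test_data():
    return gen_train_data('imgs_test')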

The training process is as follows:

Because the images are relatively simple, even a simple training run basically reaches 100% recognition.

Epoch 1/20
7/7 [==============================] - 1s 68ms/step - loss: 2.0173 - accuracy: 0.3350 - val_loss: 1.3893 - val_accuracy: 0.7950
Epoch 2/20
7/7 [==============================] - 0s 43ms/step - loss: 1.1314 - accuracy: 0.6900 - val_loss: 0.5309 - val_accuracy: 1.0000
Epoch 3/20
7/7 [==============================] - 0s 36ms/step - loss: 0.5474 - accuracy: 0.8100 - val_loss: 0.1853 - val_accuracy: 1.0000
Epoch 4/20
7/7 [==============================] - 0s 36ms/step - loss: 0.2606 - accuracy: 0.9250 - val_loss: 0.0842 - val_accuracy: 1.0000
Epoch 5/20
7/7 [==============================] - 0s 34ms/step - loss: 0.2730 - accuracy: 0.9250 - val_loss: 0.1025 - val_accuracy: 0.9700
Epoch 6/20
7/7 [==============================] - 0s 37ms/step - loss: 0.1857 - accuracy: 0.9300 - val_loss: 0.0365 - val_accuracy: 1.0000
Epoch 7/20
7/7 [==============================] - 0s 35ms/step - loss: 0.0952 - accuracy: 0.9800 - val_loss: 0.0165 - val_accuracy: 1.0000
Epoch 8/20
7/7 [==============================] - 0s 35ms/step - loss: 0.0560 - accuracy: 0.9900 - val_loss: 0.0076 - val_accuracy: 1.0000
Epoch 9/20
7/7 [==============================] - 0s 35ms/step - loss: 0.0125 - accuracy: 1.0000 - val_loss: 0.0066 - val_accuracy: 1.0000
Epoch 10/20
7/7 [==============================] - 0s 36ms/step - loss: 0.0173 - accuracy: 1.0000 - val_loss: 0.0024 - val_accuracy: 1.0000
Epoch 11/20
7/7 [==============================] - 0s 34ms/step - loss: 0.0086 - accuracy: 1.0000 - val_loss: 0.0014 - val_accuracy: 1.0000
Epoch 12/20
7/7 [==============================] - 0s 37ms/step - loss: 0.0061 - accuracy: 1.0000 - val_loss: 8.3420e-04 - val_accuracy: 1.0000
Epoch 13/20
7/7 [==============================] - 0s 33ms/step - loss: 0.0051 - accuracy: 1.0000 - val_loss: 4.9917e-04 - val_accuracy: 1.0000
Epoch 14/20
7/7 [==============================] - 0s 35ms/step - loss: 0.0020 - accuracy: 1.0000 - val_loss: 3.4299e-04 - val_accuracy: 1.0000
Epoch 15/20
7/7 [==============================] - 0s 35ms/step - loss: 0.0037 - accuracy: 1.0000 - val_loss: 2.3839e-04 - val_accuracy: 1.0000
Epoch 16/20
7/7 [==============================] - 0s 34ms/step - loss: 0.0028 - accuracy: 1.0000 - val_loss: 2.0110e-04 - val_accuracy: 1.0000
Epoch 17/20
7/7 [==============================] - 0s 36ms/step - loss: 0.0012 - accuracy: 1.0000 - val_loss: 1.8016e-04 - val_accuracy: 1.0000
Epoch 18/20
7/7 [==============================] - 0s 35ms/step - loss: 0.0015 - accuracy: 1.0000 - val_loss: 1.5284e-04 - val_accuracy: 1.0000
Epoch 19/20
7/7 [==============================] - 0s 38ms/step - loss: 8.4545e-04 - accuracy: 1.0000 - val_loss: 1.3383e-04 - val_accuracy: 1.0000
Epoch 20/20
7/7 [==============================] - 0s 36ms/step - loss: 7.2767e-04 - accuracy: 1.0000 - val_loss: 1.2135e-04 - val_accuracy: 1.0000
Test loss: 0.00012135423457948491
Test accuracy: 1.0

Training loss and accuracy chart:

2.3 Recognition and verification

Load the model, pass in the data, and get the recognition result.

Sample code:

from tensorflow.keras.models import load_model


def __recognize_img(img_data):
    model = load_model('model/ziru.h5')
    img_arr = np.array(img_data)
    img_arr = img_arr.reshape((-1, 36, 22, 1))
    result = model.predict(img_arr)
    predict_val = __parse_result(result)
    return predict_val


def __parse_result(result):
    # return the index with the highest probability, i.e. the predicted digit
    result = result[0]
    max_val = max(result)
    for i in range(10):
        if max_val == result[i]:
            return i
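
A hypothetical call, using one of the binarized crops from section 2.1 (the file name is illustrative):

from PIL import Image

img = Image.open('imgs_crop/001.png')
print(__recognize_img(img))

Note that __parse_result is simply a hand-rolled np.argmax(result[0]).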

3 Encapsulation

After the recognition works, all that remains is to encapsulate it and expose it as a service.

For convenience, it has been wrapped as an interface service. Test interface ==> https://lemon.lpe234.xyz/common/ziru/
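
The service code itself is not shown; a minimal sketch of such an endpoint using Flask (the framework choice, the img field name, and the recognize_price wrapper are all assumptions):

from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)


@app.route('/common/ziru/', methods=['POST'])
def ziru():
    # grayscale -> binarize -> cut -> recognize each digit, as in the sections above
    img = Image.open(request.files['img'].stream)
    price = recognize_price(img)  # hypothetical wrapper around the steps above
    return jsonify({'price': price})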

4 Summary

The use of CNN in this article is entry-level at best. In fact, these digits could also be recognized by comparing key pixel points: images of a 1 and a 3 are bound to differ at certain pixels, and those differences alone are basically enough to tell them apart.