Note: Do not use for commercial purposes

Python environment: Python 3.5

Section one: Preparation

1. Introduction

The project is based on Python + CNN + TensorFlow, and the CPU version of TensorFlow is used for model training. As long as your machine has more than 8 GB of memory, you can replace the training samples with your own as described in this article, modify a few model parameters, and train the model you need.

2. Common character verification codes

The captcha images shown above do not represent any actual websites; any similarity is purely coincidental. This project may only be used for learning and exchange, not for illegal purposes. The most common verification rules are: enter the characters shown in the picture, or, for colored captchas, enter only the characters of a specified color.

3. Obtaining training samples

The quality of a recognition model is directly related to the quality and quantity of the training samples. In my experience training several captcha recognition models so far, for captchas made of (case-insensitive) letters and digits, a model with about 80% accuracy can be obtained once the number of training samples exceeds 10,000. If, based on the characteristics of the specific captcha, some simple image processing is applied during both training and recognition, the sample size can be smaller still: the fifth and sixth captchas in the figure above can be recognized with 90% accuracy from only 2,000 samples after such processing.
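For illustration, a minimal preprocessing sketch is shown below. It is not part of the original project code; it assumes the interference is lighter than the characters, and the threshold of 150 is only an example value that would need tuning per captcha type. It converts an image to grayscale and binarizes it so that light interference lines and dots disappear before training and recognition:

# A minimal preprocessing sketch (assumption, not the project's original code):
# grayscale + fixed-threshold binarization to strip light interference.
from PIL import Image

def preprocess(src_path, dst_path, threshold=150):
    img = Image.open(src_path).convert('L')                      # grayscale
    binary = img.point(lambda p: 255 if p > threshold else 0)    # binarize
    binary.save(dst_path)

# usage: preprocess("raw/0001.png", "clean/0001.png")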

3.1 Manual labeling (coding platforms)

Manual labeling is the most common method. Many companies on the internet offer captcha-coding (human labeling) services, and based on those services we can label our samples in batches. The drawbacks are obvious, though. First, the coding platform charges for the data, so labeling costs money. Second, the labeled data contains some errors, which affects the final quality of the model. Mislabeled samples can be filtered out with some extra logic, for example by verifying that the labeled answer actually passes the captcha, as you can imagine.

3.2 Simulation generation

Analyze the characteristics of the captcha and generate similar, or even identical, captchas programmatically. The technical requirements are higher, but this yields large batches of free training samples.

3.3 Generating a Verification Code Based on Python

Actual website captcha (image sourced from the web)

Captcha generated by the code

Notes on generating captchas in code:

Font file: find the font actually used in the real captcha images
Font size: determine the size by visual comparison with the real images
Interference: analyze the interference in the real images and simulate it when generating

The Python code used for generation:

Other captcha formats can be generated by adapting the following code accordingly.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2019/10/12 10:01
# @Author  : shm
# @Site    :
# @File    : create_yzm.py
# @Software: PyCharm
import random
from PIL import Image, ImageDraw, ImageFont


def getRandomColor():
    '''
    Generate a random RGB color
    :return:
    '''
    r = random.randint(0, 255)
    g = random.randint(0, 255)
    b = random.randint(0, 255)
    return (r, g, b)


def getRandomChar():
    '''
    Generate a random character
    :return:
    '''
    charlist = "123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
    random_char = random.choice(charlist)
    return random_char


def genImg(width, height, font_size, chr_num):
    '''
    Generate one captcha image
    :param width: image width
    :param height: image height
    :param font_size: font size
    :param chr_num: number of characters
    :return: PIL Image
    '''
    # bg_color = getRandomColor()
    bg_color = (255, 255, 255)  # white background
    # create an image filled with the background color
    img = Image.new(mode="RGB", size=(width, height), color=bg_color)
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(font="Action Jackson", size=font_size)
    # font = ImageFont.truetype(font="Chinese color cloud", size=font_size)  # alternative Chinese font
    for i in range(chr_num):
        random_txt = getRandomChar()
        # txt_color = getRandomColor()
        txt_color = (0, 0, 255)
        # while txt_color == bg_color:
        #     txt_color = getRandomColor()
        draw.text((36 + 16 * i, 5), text=random_txt, fill=txt_color, font=font)
    # add interference lines and points
    drawLine(draw, width, height)
    drawPoint(draw, width, height)
    return img


def drawLine(draw, width, height):
    '''
    Draw interference lines
    :param draw:
    :param width:
    :param height:
    :return:
    '''
    for i in range(10):
        x1 = random.randint(0, width)
        # x2 = random.randint(0, width - x1)
        x2 = x1 + random.randint(0, 25)
        y1 = random.randint(0, height)
        y2 = y1
        # y2 = random.randint(0, height)
        # draw.line((x1, y1, x2, y2), fill=getRandomColor())
        draw.line((x1, y1, x2, y2), fill=(0, 0, 255))


def drawPoint(draw, width, height):
    '''
    Draw interference points
    :param draw:
    :param width:
    :param height:
    :return:
    '''
    for i in range(5):
        x = random.randint(0, 40)
        y = random.randint(0, height)
        # draw.point((x, y), fill=getRandomColor())
        draw.point((x, y), fill=(0, 0, 255))


def drawOther(draw):
    '''
    Reserved for other kinds of interference
    :return:
    '''
    pass


def genyzm():
    '''
    Generate a batch of captcha images
    '''
    # width/height match the 106x30 captcha analyzed in section 2.1.1;
    # the font size was lost in the original formatting, 20 is an assumed value
    width = 106
    height = 30
    font_size = 20
    chr_num = 4
    path = "./yzm_encpic/"
    for i in range(10):
        img = genImg(width, height, font_size, chr_num)
        dir = path + str(i) + ".png"
        with open(dir, "wb") as fp:
            img.save(fp, format="png")


if __name__ == "__main__":
    try:
        genyzm()
    except Exception as e:
        print(e)

Note: Real captchas often avoid easily confused characters such as 1, 0, O, Z, 2, etc., so these characters can be left out when generating your own samples, which also reduces the number of label categories when training the model.

The generation code for the second captcha, the one in Chinese character + digit + letter format, is not posted here directly. That captcha is still in use on a live website, and in order not to affect the site's normal operation the Python code is not open-sourced. The overall approach is similar to the code above, except that the background is not a single color and Chinese characters are added.
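Since that code is not open-sourced, only a rough sketch of the idea is given here. It is an assumption rather than the code used for the live site; the font file "simhei.ttf", the background image "bg.png" and the character sets are placeholders. The approach is the same as above, except the characters are drawn onto a textured background image and the font must contain Chinese glyphs:

# Rough sketch only; "bg.png" and "simhei.ttf" are placeholder assumptions.
import random
from PIL import Image, ImageDraw, ImageFont

def gen_mixed_captcha(width=100, height=38):
    charlist = "123456789ABCDEFGHJKLMNPQRSTUVWXYZ"
    cn_chars = "天地人和"                                   # example Chinese characters
    img = Image.open("bg.png").convert("RGB").resize((width, height))  # textured background
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype("simhei.ttf", 20)             # a font that contains CJK glyphs
    text = [random.choice(cn_chars)] + [random.choice(charlist) for _ in range(3)]
    for i, ch in enumerate(text):
        draw.text((5 + 24 * i, 8), ch, fill=(0, 0, 0), font=font)
    return img, "".join(text)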

Section two: Model training

2.1 Verification Code Identification Roadmap

The following two verification codes are used as examples. First, analyze whether the captcha can be segmented: if it can be cut into single characters and each character recognized separately, far fewer training samples are needed and a high recognition rate is easy to reach. Both captchas above can be split into single characters for recognition. We only need to generate 2,000 captchas by simulation; after segmentation that yields 4 * 2000 = 8000 training samples, and the accuracy can exceed 90%.

2.1.1 Verification Code characteristics analysis

In Figure 1, the captcha is 106x30 and all the characters are concentrated on the right side. Opening the image in the Windows Paint tool shows that the characters fall within the 36-100 pixel range horizontally. Therefore the 36-100 region is cropped first, giving a 64x30 image.

Code to crop the main area of the image:

import os
from PIL import Image


def screen_shot(src, dstpath):
    '''
    Crop the character region (pixels 36-100) out of the original captcha
    :param src: source image path
    :param dstpath: destination directory
    :return:
    '''
    try:
        img = Image.open(src)
        s = os.path.split(src)
        fn = s[1].split(".")
        basename = fn[0]
        ext = fn[-1]
        box = (36, 0, 100, 30)
        dstdir = dstpath + basename + "." + ext
        img.crop(box).save(dstdir)
    except Exception as e:
        print("screenshot:", e)

In Figure 2 the captcha is 100x38 with evenly distributed characters, so no additional processing is required.

2.1.2 Image cutting

The images processed from Figure 1 are cut evenly, each captcha being divided into four small images. Figure 2 is likewise cut evenly into four separate images. Inspection shows that the characters are cleanly separated into single characters. After cutting, each piece from Figure 1 is 16x30 and each piece from Figure 2 is 25x38.

Image-splitting code:

import os
import time
from PIL import Image


def split_image(src, rownum, colnum, dstpath):
    '''
    Split an image evenly into rownum x colnum pieces
    :param src: source image path (its file name is the captcha label)
    :param rownum: number of rows
    :param colnum: number of columns
    :param dstpath: destination directory (one sub-directory per character)
    :return:
    '''
    try:
        img = Image.open(src)
        w, h = img.size
        if rownum <= h and colnum <= w:
            s = os.path.split(src)
            fn = s[1].split(".")
            basename = fn[0]
            ext = fn[-1]
            rowheight = h // rownum
            colwidth = w // colnum
            num = 0
            for r in range(rownum):
                for c in range(colnum):
                    # the c-th character of the file name labels the c-th piece
                    name = str(basename[c:c + 1])
                    t = str(int(time.time() * 100000))
                    box = (c * colwidth, r * rowheight, (c + 1) * colwidth, (r + 1) * rowheight)
                    img.crop(box).save(dstpath + name + "/" + name + "#" + t + "." + ext)
                    num = num + 1
            print("split %s into %d pieces" % (src, num))  # original message was in Chinese
        else:
            print("invalid row/column count for %s" % src)  # original message was in Chinese
    except Exception as e:
        print("e:", e)

2.2 Deep learning model

The model is based on AlexNet. For a detailed introduction to AlexNet, see the related papers and articles. The model code is posted here:

This version of the model outputs a single character (the single-character recognition model):

# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains a model definition for AlexNet.

This work was first described in:
  ImageNet Classification with Deep Convolutional Neural Networks
  Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton

and later refined in:
  One weird trick for parallelizing convolutional neural networks
  Alex Krizhevsky, 2014

Here we provide the implementation proposed in "One weird trick" and not
"ImageNet Classification", as per the paper, the LRN layers have been removed.

Usage:
  with slim.arg_scope(alexnet.alexnet_v2_arg_scope()):
    outputs, end_points = alexnet.alexnet_v2(inputs)

@@alexnet_v2
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

slim = tf.contrib.slim
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)


def alexnet_v2_arg_scope(weight_decay=0.0005):
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      biases_initializer=tf.constant_initializer(0.1),
                      weights_regularizer=slim.l2_regularizer(weight_decay)):
    with slim.arg_scope([slim.conv2d], padding='SAME'):
      with slim.arg_scope([slim.max_pool2d], padding='VALID') as arg_sc:
        return arg_sc


def alexnet_v2(inputs,
               num_classes=1000,
               is_training=True,
               dropout_keep_prob=0.5,
               spatial_squeeze=True,
               scope='alexnet_v2'):
  """AlexNet version 2.

  Described in: http://arxiv.org/pdf/1404.5997v2.pdf
  Parameters from:
  github.com/akrizhevsky/cuda-convnet2/blob/master/layers/
  layers-imagenet-1gpu.cfg

  Note: All the fully_connected layers have been transformed to conv2d layers.
        To use in classification mode, resize input to 224x224. To use in fully
        convolutional mode, set spatial_squeeze to false.
        The LRN layers have been removed and change the initializers from
        random_normal_initializer to xavier_initializer.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    is_training: whether or not the model is being trained.
    dropout_keep_prob: the probability that activations are kept in the dropout
      layers during training.
    spatial_squeeze: whether or not should squeeze the spatial dimensions of the
      outputs. Useful to remove unnecessary dimensions for classification.
    scope: Optional scope for the variables.

  Returns:
    the last op containing the log predictions and end_points dict.
  """
  with tf.variable_scope(scope, 'alexnet_v2', [inputs]) as sc:
    end_points_collection = sc.name + '_end_points'
    # Collect outputs for conv2d, fully_connected and max_pool2d.
    with slim.arg_scope([slim.conv2d, slim.fully_connected, slim.max_pool2d],
                        outputs_collections=[end_points_collection]):
      net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
      net = slim.max_pool2d(net, [3, 3], 2, scope='pool1')
      net = slim.conv2d(net, 192, [5, 5], scope='conv2')
      net = slim.max_pool2d(net, [3, 3], 2, scope='pool2')
      net = slim.conv2d(net, 384, [3, 3], scope='conv3')
      net = slim.conv2d(net, 384, [3, 3], scope='conv4')
      net = slim.conv2d(net, 256, [3, 3], scope='conv5')
      net = slim.max_pool2d(net, [3, 3], 2, scope='pool5')

      # Use conv2d instead of fully_connected layers.
      with slim.arg_scope([slim.conv2d],
                          weights_initializer=trunc_normal(0.005),
                          biases_initializer=tf.constant_initializer(0.1)):
        net = slim.conv2d(net, 4096, [5, 5], padding='VALID', scope='fc6')
        net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                           scope='dropout6')
        net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
        net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                           scope='dropout7')
        net0 = slim.conv2d(net, num_classes, [1, 1],
                           activation_fn=None,
                           normalizer_fn=None,
                           biases_initializer=tf.zeros_initializer(),
                           scope='fc8_0')

      # Convert end_points_collection into a end_point dict.
      end_points = slim.utils.convert_collection_to_dict(end_points_collection)
      if spatial_squeeze:
        net0 = tf.squeeze(net0, [1, 2], name='fc8_0/squeezed')
        end_points[sc.name + '/fc8_0'] = net0
      return net0, end_points
alexnet_v2.default_image_size = 224

The model file needs to be adjusted according to the length of the identified captcha:

The model that outputs four characters (four-character captcha recognition):

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

slim = tf.contrib.slim
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)


def alexnet_v2_arg_scope(weight_decay=0.0005):
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      biases_initializer=tf.constant_initializer(0.1),
                      weights_regularizer=slim.l2_regularizer(weight_decay)):
    with slim.arg_scope([slim.conv2d], padding='SAME'):
      with slim.arg_scope([slim.max_pool2d], padding='VALID') as arg_sc:
        return arg_sc


def alexnet_v2(inputs,
               num_classes=1000,
               is_training=True,
               dropout_keep_prob=0.5,
               spatial_squeeze=True,
               scope='alexnet_v2'):
  """AlexNet version 2.

  Described in: http://arxiv.org/pdf/1404.5997v2.pdf
  Parameters from:
  github.com/akrizhevsky/cuda-convnet2/blob/master/layers/
  layers-imagenet-1gpu.cfg

  Note: All the fully_connected layers have been transformed to conv2d layers.
        To use in classification mode, resize input to 224x224. To use in fully
        convolutional mode, set spatial_squeeze to false.
        The LRN layers have been removed and change the initializers from
        random_normal_initializer to xavier_initializer.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    is_training: whether or not the model is being trained.
    dropout_keep_prob: the probability that activations are kept in the dropout
      layers during training.
    spatial_squeeze: whether or not should squeeze the spatial dimensions of the
      outputs. Useful to remove unnecessary dimensions for classification.
    scope: Optional scope for the variables.

  Returns:
    the last op containing the log predictions and end_points dict.
  """
  with tf.variable_scope(scope, 'alexnet_v2', [inputs]) as sc:
    end_points_collection = sc.name + '_end_points'
    # Collect outputs for conv2d, fully_connected and max_pool2d.
    with slim.arg_scope([slim.conv2d, slim.fully_connected, slim.max_pool2d],
                        outputs_collections=[end_points_collection]):
      net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
      net = slim.max_pool2d(net, [3, 3], 2, scope='pool1')
      net = slim.conv2d(net, 192, [5, 5], scope='conv2')
      net = slim.max_pool2d(net, [3, 3], 2, scope='pool2')
      net = slim.conv2d(net, 384, [3, 3], scope='conv3')
      net = slim.conv2d(net, 384, [3, 3], scope='conv4')
      net = slim.conv2d(net, 256, [3, 3], scope='conv5')
      net = slim.max_pool2d(net, [3, 3], 2, scope='pool5')

      # Use conv2d instead of fully_connected layers.
      with slim.arg_scope([slim.conv2d],
                          weights_initializer=trunc_normal(0.005),
                          biases_initializer=tf.constant_initializer(0.1)):
        net = slim.conv2d(net, 4096, [5, 5], padding='VALID', scope='fc6')
        net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                           scope='dropout6')
        net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
        net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                           scope='dropout7')
        net0 = slim.conv2d(net, num_classes, [1, 1],
                           activation_fn=None,
                           normalizer_fn=None,
                           biases_initializer=tf.zeros_initializer(),
                           scope='fc8_0')
        net1 = slim.conv2d(net, num_classes, [1, 1],
                           activation_fn=None,
                           normalizer_fn=None,
                           biases_initializer=tf.zeros_initializer(),
                           scope='fc8_1')
        net2 = slim.conv2d(net, num_classes, [1, 1],
                           activation_fn=None,
                           normalizer_fn=None,
                           biases_initializer=tf.zeros_initializer(),
                           scope='fc8_2')
        net3 = slim.conv2d(net, num_classes, [1, 1],
                           activation_fn=None,
                           normalizer_fn=None,
                           biases_initializer=tf.zeros_initializer(),
                           scope='fc8_3')

      # Convert end_points_collection into a end_point dict.
      end_points = slim.utils.convert_collection_to_dict(end_points_collection)
      if spatial_squeeze:
        net0 = tf.squeeze(net0, [1, 2], name='fc8_0/squeezed')
        end_points[sc.name + '/fc8_0'] = net0
        net1 = tf.squeeze(net1, [1, 2], name='fc8_1/squeezed')
        end_points[sc.name + '/fc8_1'] = net1
        net2 = tf.squeeze(net2, [1, 2], name='fc8_2/squeezed')
        end_points[sc.name + '/fc8_2'] = net2
        net3 = tf.squeeze(net3, [1, 2], name='fc8_3/squeezed')
        end_points[sc.name + '/fc8_3'] = net3
      return net0, net1, net2, net3, end_points
alexnet_v2.default_image_size = 224

The difference: compared with the single-character model, the four-character model adds three more parallel output heads (fc8_1, fc8_2, fc8_3) alongside fc8_0, one per character position, and squeezes and returns each of them; these additions were highlighted in red in the original post. Five- or six-character captchas are handled by extending the heads in the same way.
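As a sketch of that extension (showing only the changed lines, to be placed inside alexnet_v2 alongside fc8_0 through fc8_3 above), a fifth character needs one more parallel head, one more squeeze and one more return value:

# Sketch: extra head for a fifth character, following the fc8_0..fc8_3 pattern above.
net4 = slim.conv2d(net, num_classes, [1, 1],
                   activation_fn=None, normalizer_fn=None,
                   biases_initializer=tf.zeros_initializer(), scope='fc8_4')
# ...after convert_collection_to_dict:
if spatial_squeeze:
    net4 = tf.squeeze(net4, [1, 2], name='fc8_4/squeezed')
    end_points[sc.name + '/fc8_4'] = net4
return net0, net1, net2, net3, net4, end_points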

2.3 TFrecord format training data generation

For details about TFRecord files, see the TensorFlow documentation.

TFrecord training data generation code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import tensorflow as tf
import os
import random
import math
import sys
from PIL import Image
import numpy as np

# number of test samples
_NUM_TEST = 500
_RANDOM_SEED = 0
# number of characters per sample (single-character samples here)
MAX_CAPTCHA = 1
# directory of the cut single-character samples (value lost in formatting; set your own path)
DATASET_DIR = "./yzm_pic/"
TFRECORD_DIR = './TFrecord/'


def _dataset_exists(dataset_dir):
    for split_name in ['train', 'test']:
        output_filename = os.path.join(dataset_dir, split_name + '.tfrecords')
        if not tf.gfile.Exists(output_filename):
            return False
    return True


def _get_filenames_and_classes(dataset_dir):
    photo_filenames = []
    for filename in os.listdir(dataset_dir):
        path = os.path.join(dataset_dir, filename)
        photo_filenames.append(path)
    return photo_filenames


def int64_feature(values):
    if not isinstance(values, (tuple, list)):
        values = [values]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))


def bytes_feature(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[values]))


def image_to_tfexample(image_data, label0):
    # Abstract base class for protocol messages.
    return tf.train.Example(features=tf.train.Features(feature={
        'image': bytes_feature(image_data),
        'label0': int64_feature(label0)
    }))


def char2pos(c):
    # case-sensitive mapping: 62 classes plus '_'
    if c == '_':
        k = 62
        return k
    k = ord(c) - 48
    if k > 9:
        k = ord(c) - 55
        if k > 35:
            k = ord(c) - 61
            if k > 61:
                raise ValueError('No Map')
    return k


def char2pos1(c):
    # case-insensitive mapping: digits and letters share 36 classes
    if c == '_':
        k = 36
        return k
    k = ord(c) - 48
    if k > 9:
        k = ord(c) - 55
        if k > 35:
            k = ord(c) - (61 + 26)
            if k > 36:
                raise ValueError('No Map')
    return k


def _convert_dataset(split_name, filenames, dataset_dir):
    assert split_name in ['train', 'test']
    with tf.Session() as sess:
        output_filename = os.path.join(TFRECORD_DIR, split_name + '.tfrecords')
        with tf.python_io.TFRecordWriter(output_filename) as tfrecord_writer:
            for i, filename in enumerate(filenames):
                try:
                    sys.stdout.write('\r>> Converting image %d/%d' % (i + 1, len(filenames)))
                    sys.stdout.flush()
                    image_data = Image.open(filename)
                    image_data = image_data.resize((224, 224))
                    image_data = np.array(image_data.convert('L'))
                    image_data = image_data.tobytes()
                    # the first character of the file name is the label
                    labels = filename.split('\\')[-1][0:1]
                    print(labels)
                    num_labels = []
                    num_labels.append(int(char2pos1(labels)))
                    example = image_to_tfexample(image_data, num_labels[0])
                    tfrecord_writer.write(example.SerializeToString())
                    # for a four-character sample the labels would be handled like this:
                    # for j in range(4):
                    #     num_labels.append(int(char2pos1(labels[j])))
                    # example = image_to_tfexample(image_data, num_labels[0], num_labels[1], num_labels[2], num_labels[3])
                    # tfrecord_writer.write(example.SerializeToString())
                except IOError as e:
                    print('Could not read:', filename)
                    print('Error:', e)
                    print('Skip it\n')
    sys.stdout.write('\n')
    sys.stdout.flush()


if _dataset_exists(TFRECORD_DIR):
    print('tfrecord file exists')
else:
    photo_filenames = _get_filenames_and_classes(DATASET_DIR)
    random.seed(_RANDOM_SEED)
    random.shuffle(photo_filenames)
    training_filenames = photo_filenames[_NUM_TEST:]
    testing_filenames = photo_filenames[:_NUM_TEST]
    _convert_dataset('train', training_filenames, DATASET_DIR)
    _convert_dataset('test', testing_filenames, DATASET_DIR)
    print('done')
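Before training, the generated file can be sanity-checked by reading a few records back; the following is a minimal sketch, assuming the 'image'/'label0' layout written by the code above:

# Minimal sketch: print the first few labels from the generated TFRecord file.
import tensorflow as tf

for i, record in enumerate(tf.python_io.tf_record_iterator("./TFrecord/train.tfrecords")):
    example = tf.train.Example()
    example.ParseFromString(record)
    label0 = example.features.feature['label0'].int64_list.value[0]
    img_bytes = example.features.feature['image'].bytes_list.value[0]
    print("record %d: label0=%d, image bytes=%d" % (i, label0, len(img_bytes)))
    if i >= 4:
        break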

2.4 Model training:

The training code for the Figure 1 captcha is as follows:

Code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2019/4/30 10:59
# @Author  : shm
# @Site    :
# @File    : MyTensorflowTrain.py
# @Software: PyCharm
import os
import tensorflow as tf
from PIL import Image
from nets import nets_factory
import numpy as np

# number of classes (case-insensitive letters + digits)
CHAR_SET_LEN = 36
# image height
IMAGE_HEIGHT = 30
# image width
IMAGE_WIDTH = 16
BATCH_SIZE = 100
# TFRecord file path
TFRECORD_FILE = "./TFrecord/train.tfrecords"

# placeholders
x = tf.placeholder(tf.float32, [None, 224, 224])
y0 = tf.placeholder(tf.float32, [None])

# learning rate (initial value assumed; it was lost in the original formatting)
lr = tf.Variable(0.003, dtype=tf.float32)


def read_and_decode(filename):
    # read the tfrecord data
    filename_queue = tf.train.string_input_producer([filename])
    reader = tf.TFRecordReader()
    # return file name and serialized example
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(serialized_example,
                                       features={
                                           'image': tf.FixedLenFeature([], tf.string),
                                           'label0': tf.FixedLenFeature([], tf.int64)
                                       })
    # get the image data
    image = tf.decode_raw(features['image'], tf.uint8)
    # restore the original shape
    image = tf.reshape(image, [224, 224])
    # normalize the image
    image = tf.cast(image, tf.float32) / 255.0
    image = tf.subtract(image, 0.5)
    image = tf.multiply(image, 2.0)
    # get the label
    label0 = tf.cast(features['label0'], tf.int32)
    return image, label0


# get images and labels
image, label0 = read_and_decode(TFRECORD_FILE)

# shuffle_batch returns randomly shuffled batches
image_batch, label_batch0 = tf.train.shuffle_batch(
    [image, label0], batch_size=BATCH_SIZE,
    capacity=50000, min_after_dequeue=10000, num_threads=1)

# network structure
train_network_fn = nets_factory.get_network_fn(
    'alexnet_v2',
    num_classes=CHAR_SET_LEN,
    weight_decay=0.0005,
    is_training=True)

with tf.Session() as sess:
    # inputs: a tensor of size [batch_size, height, width, channels]
    X = tf.reshape(x, [BATCH_SIZE, 224, 224, 1])
    # run the network and get the outputs
    logits0, end_points = train_network_fn(X)

    # convert labels to one-hot
    one_hot_labels0 = tf.one_hot(indices=tf.cast(y0, tf.int32), depth=CHAR_SET_LEN)

    # compute loss
    loss0 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits0, labels=one_hot_labels0))
    # total loss
    total_loss = (loss0)
    # optimize total_loss
    optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(total_loss)

    # compute accuracy
    correct_prediction0 = tf.equal(tf.argmax(one_hot_labels0, 1), tf.argmax(logits0, 1))
    accuracy0 = tf.reduce_mean(tf.cast(correct_prediction0, tf.float32))

    # saver used to save the model
    saver = tf.train.Saver()
    # initialize
    sess.run(tf.global_variables_initializer())

    # create a coordinator and start the queue threads
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    for i in range(60001):
        # get a batch of images and labels
        b_image, b_label0 = sess.run([image_batch, label_batch0])
        # train the model
        sess.run(optimizer, feed_dict={x: b_image, y0: b_label0})

        # compute loss and accuracy every 20 iterations
        if i % 20 == 0:
            # lower the learning rate every 2000 iterations
            if i % 2000 == 0:
                sess.run(tf.assign(lr, lr / 3))
            acc0, loss_ = sess.run([accuracy0, total_loss], feed_dict={x: b_image, y0: b_label0})
            learning_rate = sess.run(lr)
            print("Iter:%d  Loss:%.3f  Accuracy:%.2f  Learning_rate:%.4f" % (i, loss_, acc0, learning_rate))

            # save the model once accuracy is high enough
            if acc0 > 0.99:
                saver.save(sess, "./models/crack_captcha_model", global_step=i)
                break
            if i == 60000:
                saver.save(sess, "./models/crack_captcha_model", global_step=i)
                break

    # notify the other threads to stop
    coord.request_stop()
    # this returns only after all threads have stopped
    coord.join(threads)

2.5 Model recognition rate test code:

# coding=utf-8
import os
import tensorflow as tf
from PIL import Image
from nets import nets_factory
import numpy as np
import matplotlib.pyplot as plt

CHAR_SET_LEN = 36
IMAGE_HEIGHT = 30
IMAGE_WIDTH = 16
BATCH_SIZE = 1
TFRECORD_FILE = "./TFrecord/test.tfrecords"

# placeholder
x = tf.placeholder(tf.float32, [None, 224, 224])


def read_and_decode(filename):
    filename_queue = tf.train.string_input_producer([filename])
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(serialized_example,
                                       features={
                                           'image': tf.FixedLenFeature([], tf.string),
                                           'label0': tf.FixedLenFeature([], tf.int64),
                                       })
    image = tf.decode_raw(features['image'], tf.uint8)
    # keep an unnormalized copy for display
    image_raw = tf.reshape(image, [224, 224])
    # normalized image used for prediction
    image = tf.reshape(image, [224, 224])
    image = tf.cast(image, tf.float32) / 255.0
    image = tf.subtract(image, 0.5)
    image = tf.multiply(image, 2.0)
    # label
    label0 = tf.cast(features['label0'], tf.int32)
    return image, image_raw, label0


# get images and labels
image, image_raw, label0 = read_and_decode(TFRECORD_FILE)

# batches
image_batch, image_raw_batch, label_batch0 = tf.train.shuffle_batch(
    [image, image_raw, label0],
    batch_size=BATCH_SIZE, capacity=50000,
    min_after_dequeue=10000, num_threads=1)

# network structure
train_network_fn = nets_factory.get_network_fn(
    'alexnet_v2',
    num_classes=CHAR_SET_LEN,
    weight_decay=0.0005,
    is_training=False)

with tf.Session() as sess:
    # inputs: a tensor of size [batch_size, height, width, channels]
    X = tf.reshape(x, [BATCH_SIZE, 224, 224, 1])
    # run the network
    logits0, end_points = train_network_fn(X)

    # prediction
    predict0 = tf.reshape(logits0, [-1, CHAR_SET_LEN])
    predict0 = tf.argmax(predict0, 1)

    # initialize
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

    # load the trained model
    saver = tf.train.Saver()
    saver.restore(sess, './models/crack_captcha_model-1080')

    # create a coordinator and start the queue threads
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    count = 0
    for i in range(500):
        try:
            # get a batch of images and labels
            b_image, b_image_raw, b_label0 = sess.run([image_batch, image_raw_batch, label_batch0])
        except Exception as e:
            print(e)
        # raw image (can be displayed with matplotlib if needed)
        img = Image.fromarray(b_image_raw[0], 'L')
        print('label:', b_label0)
        # predict
        label0 = sess.run(predict0, feed_dict={x: b_image})
        print('predict:', label0)
        if b_label0[0] == label0[0]:
            count = count + 1
    print(count)

    # notify the other threads to stop
    coord.request_stop()
    coord.join(threads)

Section three: API interface development

The code loads two model files directly. A single unified interface takes a module parameter, and passing different values calls different models to recognize the different captcha types.

3.1 Captcha recognition service based on Flask

API service code

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2019/10/14 10:25
# @Author  : shm
# @Site    :
# @File    : YZM_Service.py
# @Software: PyCharm
from flask import Flask, request, render_template
import tensorflow as tf
from PIL import Image
from nets import nets_factory
import numpy as np
import base64
from io import BytesIO


def num2char(num):
    '''
    Map a class index back to a character code
    :param num: class index
    :return: character code
    '''
    if num < 10:
        return (num + ord('0'))
    elif num < 36:
        return (num - 10 + ord('a'))
    elif num == 36:
        return (ord('_'))
    else:
        raise ValueError('Error')


def splitimage(img, rownum, colnum):
    '''
    Split an image into rownum x colnum pieces
    :param img: PIL image
    :param rownum: number of rows
    :param colnum: number of columns
    :return: list of sub-images
    '''
    w, h = img.size
    if rownum <= h and colnum <= w:
        rowheight = h // rownum
        colwidth = w // colnum
        r = 0
        imlist = []
        for c in range(colnum):
            box = (c * colwidth, r * rowheight, (c + 1) * colwidth, (r + 1) * rowheight)
            imlist.append(img.crop(box))
        return imlist


def ImageReshap(img):
    '''
    Resize to 224x224 and convert to grayscale
    :return: numpy array
    '''
    image_data = img.resize((224, 224))
    image_data = np.array(image_data.convert('L'))
    return image_data


class LoadModel_v1:
    def __init__(self, model_path, char_set_len=36):
        '''
        :param model_path: model file path
        :param char_set_len: number of classes
        '''
        self.char_set_len = char_set_len
        g = tf.Graph()
        with g.as_default():
            self.sess = tf.Session(graph=g)
            self.graph = self.build_graph()
            BATCH_SIZE = 1
            self.x = tf.placeholder(tf.float32, [None, 224, 224])
            self.img = tf.placeholder(tf.float32, None)
            # same normalization as in training
            image_data1 = tf.cast(self.img, tf.float32) / 255.0
            image_data2 = tf.subtract(image_data1, 0.5)
            image_data3 = tf.multiply(image_data2, 2.0)
            self.image_batch = tf.reshape(image_data3, [1, 224, 224])
            X = tf.reshape(self.x, [BATCH_SIZE, 224, 224, 1])
            self.logits0, self.end_points = self.graph(X)
            self.sess.run(tf.global_variables_initializer())
            saver = tf.train.Saver()
            saver.restore(self.sess, model_path)

    def build_graph(self):
        '''build the network'''
        train_network_fn = nets_factory.get_network_fn(
            'alexnet_v2',
            num_classes=self.char_set_len,
            weight_decay=0.0005,
            is_training=False)
        return train_network_fn

    def recognize(self, image):
        '''recognize a single character image'''
        try:
            inputdata = self.sess.run(self.image_batch, feed_dict={self.img: image})
            predict0 = tf.reshape(self.logits0, [-1, self.char_set_len])
            predict0 = tf.argmax(predict0, 1)
            label = self.sess.run(predict0, feed_dict={self.x: inputdata})
            text = chr(num2char(label[0]))
            return text
        except Exception as e:
            print("recognize:", e)
            return ""

    def screen_shot(self, img):
        '''crop the character region'''
        try:
            box = (36, 0, 100, 30)
            return img.crop(box)
        except Exception as e:
            print("screenshot:", e)
            return None

    def img_to_text(self, imgdata):
        '''full pipeline: crop, split and recognize each character'''
        yzmstr = ""
        with BytesIO() as iofile:
            iofile.write(imgdata)
            with Image.open(iofile) as img:
                img = self.screen_shot(img)
                imglist = splitimage(img, 1, 4)
                text = []
                for im in imglist:
                    imgreshap = ImageReshap(im)
                    yzmstr = self.recognize(imgreshap)
                    text.append(yzmstr)
                yzmstr = "".join(text)
        return yzmstr


class LoadModel_v2(LoadModel_v1):
    def __init__(self, model_path):
        super(LoadModel_v2, self).__init__(model_path)

    def img_to_text(self, imgdata):
        # Figure 2 style captcha: no cropping needed, just split into 4
        yzmstr = ""
        with BytesIO() as iofile:
            iofile.write(imgdata)
            with Image.open(iofile) as img:
                imglist = splitimage(img, 1, 4)
                text = []
                for im in imglist:
                    imgreshap = ImageReshap(im)
                    yzmstr = self.recognize(imgreshap)
                    text.append(yzmstr)
                    print(yzmstr)
                yzmstr = "".join(text)
        return yzmstr


app = Flask(__name__)


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/Recognition', methods=['POST'])
def recognition():
    try:
        imgdata = request.form.get('imgdata')
        module = request.form.get("module", "")
        if module == "v1":
            decodeData = base64.b64decode(imgdata)
            yzmstr = loadModel_model1.img_to_text(decodeData)
            return yzmstr
        elif module == "v2":
            decodeData = base64.b64decode(imgdata)
            yzmstr = loadModel_model2.img_to_text(decodeData)
            return yzmstr
        else:
            return "unknown channel"
    except Exception as e:
        return repr(e)


if __name__ == "__main__":
    loadModel_model1 = LoadModel_v1("./models/crack_captcha_model-1080")
    loadModel_model2 = LoadModel_v2("./models/crack_captcha.model-2140")
    app.run(host='0.0.0.0', port=2002, debug=True)

3.2 Interface call code

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2019/5/6 18:46
# @Author  : shm
# @Site    :
# @File    : test.py
# @Software: PyCharm
import base64
import requests
import os

# recognition API endpoint
url = "http://127.0.0.1:2002/Recognition"
# directory of labeled test captchas (value lost in formatting; file name = ground-truth label)
path = r".\testpic"
model = "v1"
# model = "v2"

imglist = os.listdir(path)
count = 0
nums = len(imglist)
for file in imglist:
    try:
        dir = path + "\\" + file
        with open(dir, "rb") as fp:
            database64 = base64.b64encode(fp.read())
        form = {
            'module': model,
            'imgdata': database64
        }
        r = requests.post(url, data=form)
        res = r.text
        # ground-truth label taken from the file name
        yuan = file[0:4]
        if yuan.lower() == res:
            count = count + 1
            print("Success")
        else:
            print(file[0:4], "==", res)
    except Exception as e:
        print(e)
print("-----model: %s  total: %s  correctly recognized: %s-----" % (model, nums, count))

Conclusion

This article mainly introduced how to generate captcha training samples by simulation and how to slice captchas into single characters for recognition. Follow-up articles will build on this one to cover training a whole-image recognition model that does not rely on segmentation, and a solution for recognizing variable-length captchas with deep learning. The code for the general captcha recognition project will be synced to Git, and the address will be added here later. If anything in the article is lacking, or you have any questions, feel free to add QQ 1071830794 to discuss; you are welcome to learn and grow together.

Do not use for commercial purposes

Thank you for reading