1. A brief introduction to handwritten digit recognition technology
As a branch of image recognition, handwritten digit recognition is one of the important applications in the fields of image processing and pattern recognition, and it is highly general. It is also a challenging task: handwritten digits vary greatly, and factors such as stroke thickness, character size and tilt angle can directly affect recognition accuracy. Over the past decades many methods have been proposed and considerable progress has been made. Handwritten digit recognition is highly practical and has broad application prospects in large-scale data processing, for example in routine annual inspections, census taking, finance, taxation and mail sorting. This case study describes the process of recognizing handwritten Arabic numerals in images. A statistics-based approach to handwritten digit recognition is briefly introduced and analyzed, and the experiment is carried out by developing a small handwritten digit recognition system. Such a system needs to implement image reading, feature extraction, construction of a digit template feature database, and recognition.
1.1 Basic principle of the BP algorithm
Known input vectors and their corresponding output vectors (expected outputs) are used as training samples, and the network to be trained starts from some initial set of weights. To eliminate the negative impact of the gradient magnitude, the resilient back-propagation algorithm updates the weights through the following steps (Figure 1). First, starting from the initial weights (whether correct or not), the input is propagated forward from the input layer and the outputs of all neurons are computed; this generally leaves a large error between the output of the output layer and the desired output (i.e., between the output value and the target value). Then the gradient of the error function (also called the loss function, objective function or cost function) with respect to each weight is calculated, the weights are adjusted in the direction in which the error decreases fastest, and the output error is propagated back to the hidden layers so that the error function keeps decreasing. While the error gradient is being computed, the hidden-layer weights are updated in the same way. This iteration continues until the loss function reaches the desired goal. In the resilient back-propagation algorithm, the size of each weight change is an adaptive step value, and the gradient only determines the direction of the change, i.e., positive or negative.
Figure 1. Back-propagation neural network model
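To make the role of the gradient sign concrete, the following is a minimal, illustrative sketch of a resilient-back-propagation-style update for one weight matrix; the function name rprop_step and all variable names are assumptions made for illustration and are not taken from the system's actual code.

function [W, delta, gradPrev] = rprop_step(W, grad, gradPrev, delta)
% Simplified Rprop-style update: the step size delta adapts per weight,
% while the gradient contributes only its sign (the direction of change).
etaPlus = 1.2; etaMinus = 0.5;      % factors for growing/shrinking the step size
deltaMax = 50; deltaMin = 1e-6;     % bounds on the step size
s = sign(grad .* gradPrev);         % +1 if the gradient kept its sign, -1 if it flipped
delta(s > 0) = min(delta(s > 0)*etaPlus,  deltaMax);  % same sign: take bigger steps
delta(s < 0) = max(delta(s < 0)*etaMinus, deltaMin);  % sign flip: take smaller steps
W = W - sign(grad).*delta;          % only the sign of the gradient sets the direction
gradPrev = grad;                    % remember the gradient for the next iteration
end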
1.2 Perceptron neural network
The perceptron is a simple artificial neural network for binary pattern classification. It simulates the synapses of nerve cells with weights and the cell body with an activation function, while the bias acts as the threshold. The structure of a single-layer perceptron network is shown in Figure 2. A single-layer perceptron divides the external input X into two categories: when the perceptron output y is positive or zero, the input belongs to the first category; when the output is negative, the input belongs to the second category.
1.3 Implementation process
(1) Image reading
A sample database was designed and built for this work. It contains 5000 handwritten digit images covering the 10 Arabic digits 0 to 9; they are BMP files with black digits on a white background, and each digit corresponds to 500 images. For each digit, 450 images were randomly selected as training samples and the remaining 50 as test samples. Some sample digits are shown in Figure 3.
Figure 3. Sample digit images
(2) Feature extraction
The number of training samples in this design is large, and the number of neurons in the input layer of a neural network is usually the dimension of the training sample vector, so the dimension of the sample vectors must be reduced. In preprocessing, each image is converted to a binary image with a gray-level threshold function. Before dimensionality reduction, all images are scaled so that every input image has the same number of pixels. In this design the images are scaled to a height of 70 pixels and a width of 50 pixels, which matches the typical height-to-width ratio of handwritten Arabic numerals. As shown in Figure 4, every 10x10 group of pixels is treated as one pixel block, so each binary image is divided into 35 pixel blocks. The ratio of 0s to 1s in each block is computed and used as one feature value, forming a 7x5 feature matrix (35 values). Since the input vector of the perceptron neural network must be one-dimensional, this matrix is reshaped into a one-dimensional vector and used as the input of a training sample; after the transformation, each image yields a 35-element column vector.
Figure 4. Image partitioning after scaling
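As a compact illustration of the block-ratio feature just described, the sketch below converts a 70x50 binary image into the 35-element input vector; bwz here is a placeholder image and the convention 1 = stroke pixel is assumed. The get_feature function in the source listing of Section 2 is the fuller version actually used.

% Block-ratio feature sketch: bwz is assumed to be 70x50 logical, 1 = stroke pixel.
bwz = false(70, 50); bwz(20:55, 20:30) = true;   % placeholder "digit" for illustration
feat = zeros(7, 5);                              % one feature value per 10x10 block
for r = 1:7
    for c = 1:5
        blk = bwz(10*r-9:10*r, 10*c-9:10*c);     % the (r,c) 10x10 pixel block
        feat(r, c) = sum(blk(:) == 0) / 100;     % fraction of background pixels in the block
    end
end
p = feat(:);                                     % 35x1 input vector for the network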
(3) Label construction
Labels must be constructed for both the training samples and the test samples: the former are used to learn the mapping, while the latter are used to judge the accuracy of the trained network. In a classification network the number of output-layer neurons generally equals the number of classes; there are 10 Arabic digits, so the number of output neurons is 10. Each class consists of 500 specific images, including its training and test samples, and feature extraction turns each image into a 35-element column vector. The 500 column vectors of a class share the same one-hot label: (1 0 0 0 0 0 0 0 0 0)^T labels pattern 1, i.e. the digit 1; (0 1 0 0 0 0 0 0 0 0)^T labels pattern 2, i.e. the digit 2; (0 0 1 0 0 0 0 0 0 0)^T labels pattern 3, i.e. the digit 3; and so on, with the last vector (0 0 0 0 0 0 0 0 0 1)^T labeling pattern 0, i.e. the digit 0.
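A minimal sketch of this one-hot labeling, assuming the class index of every sample is already stored in a vector classIdx with values 1 to 10 (10 standing for the digit 0); the variable names here are illustrative only.

% One-hot label construction sketch (illustrative data and names).
classIdx = [1 2 3 10];                      % example class indices; 10 encodes the digit 0
numClasses = 10;
T = zeros(numClasses, numel(classIdx));     % one label column per sample
for n = 1:numel(classIdx)
    T(classIdx(n), n) = 1;                  % e.g. class 1 -> (1 0 0 0 0 0 0 0 0 0)^T
end
disp(T)                                     % each column is the label of one sample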
(4) Random selection of training and test samples
The experiment uses MATLAB's built-in pseudo-random generator rand() to produce 5000 pseudo-random numbers between 0 and 1. These numbers are sorted in ascending order, the original positions of the random numbers are recorded through the sort index, and those positions are combined into a new row vector that serves as a random permutation of the samples. In this design there are 35 neurons in the input layer and 10 neurons in the output layer, and 25 neurons are used in the hidden layer.
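The following sketch mirrors that shuffling idea: 5000 uniform pseudo-random numbers are sorted and the sort index is used as a random permutation of the sample positions. The feature matrix VT (35x5000) and label vector LT are represented here by placeholders, and the per-digit 450/50 split is simplified to a global 4500/500 split for brevity.

% Random train/test split sketch using rand + sort, as described in the text.
VT = rand(35, 5000);             % placeholder 35x5000 feature matrix
LT = randi(10, 1, 5000);         % placeholder 1x5000 class indices
rv = rand(1, 5000);              % 5000 pseudo-random numbers between 0 and 1
[~, idx] = sort(rv, 'ascend');   % the sort index is a random permutation of 1..5000
p_train = VT(:, idx(1:4500));   t_train = LT(:, idx(1:4500));    % training set
p_test  = VT(:, idx(4501:5000)); t_test  = LT(:, idx(4501:5000)); % test set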
(5) Digit recognition and accuracy calculation
Recognition accuracy is computed by comparing the test labels with the outputs produced by simulating the trained network. The output value is subtracted from the corresponding label value to obtain an error value; an error of 0 is counted as a correct recognition, and the accuracy rate of the neural network is calculated from the proportion of correct recognitions.
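A minimal sketch of this accuracy computation, assuming t_test holds the 10xN one-hot test labels and y holds the corresponding network outputs from sim(); comparing the largest output neuron with the labeled class is one simple way to realize the "error equals zero" criterion and is not taken verbatim from the system's code.

% Accuracy calculation sketch (illustrative data).
t_test = eye(10);                        % 10 one-hot test labels, one column per sample
y = t_test + 0.05*randn(size(t_test));   % placeholder outputs as returned by sim()
[~, target] = max(t_test, [], 1);        % true class index of each column
[~, pred]   = max(y, [], 1);             % predicted class = index of the largest output
err = target - pred;                     % label minus output
accuracy = sum(err == 0) / numel(err)    % an error of 0 counts as a correct recognition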
2. Partial source code
clc; clear all; close all; warning off all;
fd = fullfile(pwd,'images','dbx');
fds = dir(fd);
ts = {};
for i = 1 : length(fds)
if isequal(fds(i).name, '.') || isequal(fds(i).name, '..')
continue;
end
ts{end+1} = fds(i).name;
end
files = GetAllFiles(fd);
db_file = fullfile(pwd,'VL.mat');
if exist(db_file, 'file')
load(db_file);
else
VT = [];
LT = [];
for i = 1 : length(files)
im = imread(files{i});
[~, v] = get_feature(im); % 35-dimensional feature vector of this sample
VT = [VT v];
[pn, ~, ~] = fileparts(files{i});
[~, nm, ~] = fileparts(pn); % folder name is the class label
for j = 1 : length(ts)
if isequal(ts{j}, nm)
LT = [LT j];
break;
end
end
end
save(db_file, 'VT', 'LT');
end
% BP training
net_file = fullfile(pwd,'bp_net.mat');
if exist(net_file, 'file')
load(net_file);
else
p_train=VT;
t_train=LT;
[pn,minp,maxp,tn,mint,maxt] = premnmx(p_train, t_train);
threshold=minmax(pn);
net=newff(threshold,[30 20 10 1],{'tansig','tansig','tansig','purelin'},'trainlm');
net.trainParam.epochs=10000;
net.trainParam.goal=1e-5;
net.trainParam.show=50;
net.trainParam.lr=0.01;
net=train(net,pn,tn);
% store the trained network
save(net_file,'net','minp','maxp','mint','maxt');
end
m=1; % The image contrast factor has a great influence on the recognition success rate. If the image itself already has high contrast, set it to 1; if the lighting is dim, increase the value appropriately. For images like those in the sample folder, m should be about 0.3~0.5.
[fn,pn,fi]=uigetfile('*.jpg','Select picture'); % select an image
I=imread([pn fn]);
figure(1),imshow(I); title('Original image');
J=imadjust(I,[0.2 0.6],[0 1],m);
figure(2),imshow(J); title('Grayscale image');
grayimg = I;
BWimg = grayimg;
[width,height]=size(grayimg); % image height and width
% thresh = graythresh(I); % automatically determine the binarization threshold
A=im2bw(I,0.6); % a threshold of 0.5 turns pixels with gray level below 128 black and those above 128 white
figure(3); imshow(A); title('Binary image'); % display the binary image
bw = edge(A,'sobel','vertical');
figure(4); imshow(bw); title('Edge image');
Z = strel('rectangle',[30 18]);
bw_close = imclose(bw,Z);
figure(5); imshow(bw_close); title('Close operation');
bw_open = imopen(bw,Z);
figure(6); imshow(bw_open); title('Open operation');
showImg = grayimg;
% Binarize the image data. If the image is too large the processing time grows sharply, so small images are preferred.
for i=1:width
for j=1:height
if(BWimg(i,j) == 255)
showImg(i,j)= grayimg(i,j);
else
showImg(i,j)= 0;
end
end
end
figure(7); imshow(showImg); % display the masked image
[l,m] = bwlabel(bw_close); % label connected regions
status=regionprops(l,'BoundingBox');
centroid=regionprops(l,'Centroid'); % measure the character regions
imshow(I); hold on;
a=[-7 -7 7 7]; % margin added around each bounding box
if m>1
for i=1:m
if status(i).BoundingBox(1,3) > status(i).BoundingBox(1,4) % normalize the box to a square
status(i).BoundingBox(1,4)=status(i).BoundingBox(1,3);
else
status(i).BoundingBox(1,3)=status(i).BoundingBox(1,4);
end
status(i).BoundingBox=status(i).BoundingBox+a;
rectangle('position',status(i).BoundingBox,'edgecolor','g'); % draw the character frame
text(centroid(i,1).Centroid(1,1)-25,centroid(i,1).Centroid(1,2)-25,num2str(i),'Color','r'); % label the character frame
end
for i=1:m
cropimg_2 = imcrop(A,status(i).BoundingBox);
cropimg_2 = imresize(cropimg_2,[28 28]); % resize the cropped character to 28x28
cropimg_2 = imcomplement(cropimg_2);
[file_path,~,~]= fileparts(mfilename('fullpath'));
disp(file_path)
imwrite(cropimg_2,[file_path,'\LXJC\',num2str(i,'%02d'),'.bmp'],'bmp'); % save the cropped character
filePath = [file_path,'\LXJC\',num2str(i,'%02d'),'.bmp'];
if isequal(filePath, 0)
break;
end
[~, p_test] = get_feature(cropimg_2); % extract features from the cropped character
p2n = tramnmx(p_test,minp,maxp);
r=sim(net,p2n);
r2n = postmnmx(r,mint,maxt);
r = ts{round(r2n)}; % map the network output back to a class name
figure; imshow(cropimg_2); title(r,'FontSize',16);
end
else
% (branch omitted in this source excerpt)
end

function [bwz, p] = get_feature(im) % extract the 35-dimensional feature of a digit image
bw = im2bw(im, graythresh(im));
bw = ~bw; % strokes become 1, background becomes 0
[r, c] = find(bw);
rect = [min(c)-1 min(r)-1 max(c)-min(c)+2 max(r)-min(r)+2]; % tight bounding box of the digit
bwt = imcrop(bw, rect);
rate = 70/size(bwt, 1);
rc = round(size(bwt)*rate);
bwt = imresize(bwt, rc, 'bilinear');
if size(bwt, 2) < 50
bwz = zeros(70,50);
ss = round((size(bwz, 2)-size(bwt,2)) *0.5);
tt = round((size(bwz, 1)-size(bwt,1)) *0.5);
bwz(:, ss:ss+size(bwt,2)- 1) = bwt;
else
bwz = imresize(bwt, [70 50], 'bilinear');
end
bwz = logical(bwz);
for k=1:7
for k2=1:5
dt=sum(bwz((k*10-9):(k*10),(k2*10-9):(k2*10)));
f((k- 1) *5+k2)=sum(dt);
end
end
f=(100-f)/100; % fraction of background pixels in each 10x10 block
p = f(:);

function filePath = OpenImageFile(imgfilePath)
% Open an image file.
% Output parameter: filePath -- full path of the selected file
if nargin < 1
imgfilePath = fullfile(pwd, 'images/testx/1-003.bmp');
end
% read the file
[filename, pathname, ~] = uigetfile( ...
{'*.bmp;*.jpg;*.tif;*.png;*.gif','All Image Files'; ...
'*.*','All files (*.*)'}, ...
'Select file','MultiSelect','off', imgfilePath);
if isequal(filename, 0) || isequal(pathname, 0)
filePath = 0;
return;
end
filePath = fullfile(pathname, filename);
3. Operation results
4. MATLAB version and references
1. MATLAB version: 2014a
2. References
[1] Cai Limei. MATLAB Image Processing: Theory, Algorithms and Case Analysis [M]. Tsinghua University Press, 2020.
[2] Yang Dan, Zhao Haibin, Long Zhe. Detailed Examples of MATLAB Image Processing [M]. Tsinghua University Press, 2013.
[3] Zhou Pin. MATLAB Image Processing and Graphical User Interface Design [M]. Tsinghua University Press, 2013.
[4] Liu Chenglong. Mastering MATLAB Image Processing [M]. Tsinghua University Press, 2015.