A brief introduction to handwritten capital letter recognition technology
In college and university teaching, examinations are the most common means of teaching evaluation and comprehensive practice. With the progress of science and technology, the way examination papers are marked has also changed greatly. The traditional method is manual marking, which is inefficient. The modern method uses Optical Mark Recognition (OMR) technology: candidates only need to fill in an answer sheet, and a computer processes the sheet to mark it automatically. However, this approach requires a specially designed answer sheet and pencil, and the sheet must be filled in according to certain specifications. Both methods impose limitations on teachers and candidates and cost extra time. Therefore, to let examinees answer questions flexibly and efficiently, it is necessary to study handwritten character recognition.
Handwritten character recognition methods generally fall into two categories according to the technical route taken: feature extraction and neural networks. Feature extraction methods can in turn be divided into two kinds according to the character features they extract. The first kind is recognition based on contour and structural features: the algorithm identifies the contour, endpoints, concave-convex structure, and other features of the character image, thins the character, and recognizes it by template matching. Although such methods describe character structure intuitively, they cannot handle noisy or deformed characters well and lack robustness. The second kind is recognition based on statistical features, which trains classifiers on a large number of samples and uses them to classify the characters to be recognized.
The advantage is that, given sufficient samples, this kind of method achieves a good recognition effect; the disadvantage is that it is often impossible to construct a sufficiently rich sample set. The neural network method can reach a higher recognition rate, but it must be trained on a large number of samples, which takes more time and reduces real-time performance. Even with improvements and optimization, this deficiency of neural networks is not fundamentally remedied.
In examinations, the subjects' writing styles differ, so the features of handwritten characters vary, which means a single scheme will not achieve a good recognition effect. In marking the objective questions of an examination paper, most answers contain only the four characters A, B, C, and D; since the character set is small, recognizing only A to D already yields good marking efficiency and a high recognition rate. According to these characteristics and the application scenario, this paper proposes a handwritten English letter recognition method based on combined features. The method adds shape feature extraction on top of contour feature extraction; the feature information is simple to extract and requires no sample training, so both the recognition success rate and the recognition speed for handwritten English letters are improved.
1 Image preprocessing
Because handwritten character shapes differ, illumination changes, and the paper may not lie flat when photographed, the character images contain a certain amount of noise. The image must therefore be preprocessed to ease the later letter segmentation and feature extraction and to improve the final recognition accuracy. Preprocessing consists of: Gaussian filtering to remove the Gaussian noise produced by the camera; binarization to remove the background; and finally contour extraction to remove large noise points.
1.1 Gaussian filtering
A digital image may be polluted by noise during acquisition and transmission, and the internal components of the camera sensor are most likely to produce Gaussian noise, i.e., noise whose probability density function obeys a Gaussian distribution. The common remedy for this kind of noise is Gaussian filtering. A Gaussian filter is a linear smoothing filter suitable for eliminating Gaussian noise and is widely used for noise reduction in image processing. Tests on the sample set with 1×1, 3×3, and 5×5 Gaussian kernels showed that the 3×3 kernel performed best, so this paper uses a 3×3 Gaussian kernel to filter the original image.
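The 3×3 Gaussian smoothing step can be sketched as follows. This is a plain-Python illustration (the paper's own code is Matlab) using the common 3×3 integer approximation of the Gaussian kernel; the function name is mine.

```python
# 3x3 Gaussian smoothing sketch; KERNEL is the usual [1 2 1; 2 4 2; 1 2 1]/16 approximation.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]

def gaussian_filter_3x3(img):
    """Convolve img (list of lists of gray values) with a 3x3 Gaussian kernel.
    Border pixels are left unchanged for simplicity."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            acc = 0
            for u in range(3):
                for v in range(3):
                    acc += KERNEL[u][v] * img[i - 1 + u][j - 1 + v]
            out[i][j] = acc // 16   # kernel weights sum to 16
    return out
```

An isolated bright pixel is spread out and attenuated, which is exactly how the filter suppresses pixel-level noise.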
1.2 Image binarization
Binarization exploits the gray-level difference between the handwritten character region and its background to separate the two and facilitate later processing. Commonly used binarization algorithms include the maximum between-class variance method (OTSU), the iterative method, Bernsen, Niblack, the simple statistical method, and Niblack combined with Fuzzy C-Means (NFCM). OTSU is an adaptive threshold determination method: its basic idea is to find the optimal threshold that divides the image gray histogram into black and white parts so that the between-class variance is maximized and the within-class variance is minimized, hence the name maximum between-class variance method. In this paper the handwritten English letters differ clearly in gray level from the background, so the OTSU method is suitable. Figure 1 shows the image binarization process.
FIG. 1 Image binarization
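The OTSU idea described above can be written out compactly. A plain-Python sketch (the article relies on Matlab built-ins; the function name is mine) that searches all 256 candidate thresholds for the one maximizing the between-class variance:

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) maximizing between-class variance
    for a flat list of 8-bit gray values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 = sum(hist[:t + 1])           # background pixel count
        w1 = total - w0                  # foreground pixel count
        if w0 == 0 or w1 == 0:
            continue                     # one class empty: not a valid split
        mu0 = sum(i * hist[i] for i in range(t + 1)) / w0
        mu1 = sum(i * hist[i] for i in range(t + 1, 256)) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Maximizing the between-class variance is equivalent to minimizing the within-class variance, which is why a single pass over the histogram suffices.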
1.3 Contour denoising
After Gaussian filtering, part of the noise introduced by the camera sensor is removed or reduced, but noise present on the paper itself is not handled well enough. In practice, the background region outside the letters may contain unevenly distributed, irregular noise. This paper therefore finds the contours of the connected domains, computes each contour's area, and sets an area threshold: any connected domain whose contour area is below the threshold is judged to be noise and eliminated. After a large number of experiments, this paper removes components whose pixel area is less than 12. The specific contour extraction method is described in detail below. Figure 2 shows the image after contour denoising.
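The area-threshold denoising step can be illustrated as follows. This plain-Python sketch uses pixel area of each connected component rather than the contour area the paper computes, but the thresholding logic is the same; the function name and the 4-connectivity choice are mine.

```python
def remove_small_components(img, min_area=12):
    """img: list of lists of 0/1. Erases 4-connected white components
    smaller than min_area pixels, in place, and returns img."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    for si in range(h):
        for sj in range(w):
            if img[si][sj] == 1 and not seen[si][sj]:
                # flood fill to collect one connected component
                stack, comp = [(si, sj)], []
                seen[si][sj] = True
                while stack:
                    i, j = stack.pop()
                    comp.append((i, j))
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < h and 0 <= nj < w and img[ni][nj] == 1 and not seen[ni][nj]:
                            seen[ni][nj] = True
                            stack.append((ni, nj))
                if len(comp) < min_area:
                    for i, j in comp:
                        img[i][j] = 0   # component judged as noise: erase it
    return img
```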
2 Character segmentation
Image projection is one of the important techniques in image processing. Generally, the total number of pixels (black or white) at each position along the X or Y axis of the image is counted, and the corresponding projection curve is drawn to analyze the image features. The technique is often used in image segmentation and in character detection and extraction. Formulas (1) and (2) give the horizontal and vertical projections respectively:
H(i) = Σ(j=1..w) f(i, j)  (1)
V(j) = Σ(i=1..h) f(i, j)  (2)
where h and w are the height and width of the binary image, and f(i, j) is the value (1 or 0) of the element in row i and column j of the image.
2.1 Line segmentation
Line segmentation projects the image horizontally to obtain the total number of white points in each row. If the total for some row is 0, there is no handwriting at that position. Since the common writing habit is to write line by line with a certain distance between lines, line segmentation reveals which rows are blank, so the handwriting area can be roughly divided into lines.
Figure 3 Row splitting process
2.2 Column segmentation
Column segmentation projects the image vertically to obtain the total number of white points in each column. If the total for some column is 0, that position is a gap between letters. Column segmentation thus easily extracts each letter for subsequent recognition.
Figure 4 Column splitting process
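The line and column splits above both reduce to the same operation: project the binary image, then treat each maximal run of non-zero projection values as one line (or one letter). A plain-Python sketch, with function names of my own choosing:

```python
def projection(img, axis):
    """axis=0: horizontal projection (white count per row);
    axis=1: vertical projection (white count per column)."""
    if axis == 0:
        return [sum(row) for row in img]
    return [sum(col) for col in zip(*img)]

def nonzero_runs(proj):
    """Return (start, end) index pairs of maximal runs where proj > 0;
    each run corresponds to one line or one letter."""
    runs, start = [], None
    for k, v in enumerate(proj):
        if v > 0 and start is None:
            start = k                    # a run of handwriting begins
        elif v == 0 and start is not None:
            runs.append((start, k - 1))  # a zero column/row ends the run
            start = None
    if start is not None:
        runs.append((start, len(proj) - 1))
    return runs
```

Applying `nonzero_runs` to the horizontal projection gives the lines; applying it to the vertical projection of one line gives the individual letters.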
2.3 Minimum image segmentation
After line and column segmentation, the letters are basically split into separate images, but these images are large and the letter often occupies only a small proportion of the whole image. To reduce processing time and ease the later letter recognition, each segmented letter must be cut again so that the letter fills the whole picture. The black border in Figure 5 is the image boundary.
FIG. 5 Schematic diagram of the minimum image
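The "minimum image" cut is a tight bounding-box crop around the white pixels. A plain-Python sketch (function name is mine):

```python
def crop_to_content(img):
    """img: list of lists of 0/1; returns the sub-image bounded by the
    min/max row and column containing a white pixel."""
    rows = [i for i, row in enumerate(img) if any(row)]
    cols = [j for j in range(len(img[0])) if any(row[j] for row in img)]
    if not rows:
        return img                     # blank image: nothing to crop
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in img[top:bottom + 1]]
```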
3 Classification and recognition
The contour features of letters A to D are easy to find: letters A and D each contain one closed curve, letter B contains two closed curves, and letter C contains no closed curve. The letters can therefore be separated by counting their closed curves, after which A and D must still be distinguished. After the contour feature is extracted, the shape feature is extracted, and the recognition results of the two features are fused to obtain the final result.
3.1 Contour feature extraction
Contour extraction is a common operation in image processing. A contour can be regarded simply as a curve connecting continuous points of the same color or gray level. Through topological analysis of the binary image, the image is scanned and different integer values are assigned to different contours, thereby determining the outer and inner (hole) contours and their hierarchical relationship. In Figure 6 there are two black contours: 1a is the outer boundary of the first black contour, 1b its hole boundary, 2a the outer boundary of the second black contour, and 2b its hole boundary.
The operation in Figure 6 first scans the rows of the binary image and examines the pixel values; Formula (3) denotes the pixel value of the image, where i is the row index and j the column index. When the row scan reaches the outer boundary or the hole boundary of a contour, the scan stops; Equation (4) is the stopping condition for an outer boundary and Equation (5) for a hole boundary. Taking the f(i, j) at which the scan stopped as the starting point, the pixels on the boundary are marked and assigned a value (such as 1); each time a new boundary is found the value is increased by 1. In Figure 6 there are two contours, the first marked 1 and the second marked 2. In this way contour detection is completed.
In this paper, by extracting the contours of the segmented handwritten English letters, A~D are easily divided into three classes: C with 0 contours; A and D with 1 contour; and B with 2 contours. Since letters A and D have the same number of contours, the two must still be distinguished.
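The contour count used for classification equals the number of enclosed holes in the letter: 0 for C, 1 for A and D, 2 for B. As an illustration (not the topological scanning algorithm the paper describes), holes can be counted as background regions that do not touch the image border; the function name is mine.

```python
def count_holes(img):
    """img: list of lists of 0/1; count 4-connected background regions
    fully enclosed by the letter (i.e., not touching the border)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]

    def flood(si, sj):
        # flood-fill one background region; report whether it reaches the border
        stack = [(si, sj)]
        seen[si][sj] = True
        touches_border = False
        while stack:
            i, j = stack.pop()
            if i in (0, h - 1) or j in (0, w - 1):
                touches_border = True
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w and img[ni][nj] == 0 and not seen[ni][nj]:
                    seen[ni][nj] = True
                    stack.append((ni, nj))
        return touches_border

    holes = 0
    for i in range(h):
        for j in range(w):
            if img[i][j] == 0 and not seen[i][j]:
                if not flood(i, j):
                    holes += 1       # enclosed background region = one closed curve
    return holes
```

On this count, `0` maps to C, `2` to B, and `1` leaves {A, D} to be separated by the area ratio described next.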
Analyzing the contour characteristics of letters A and D shows that, at the same size, the contour area of letter A is smaller than that of letter D. For letters A and D reduced to the minimum image, statistical analysis of 182 handwritten English letter samples gives a ratio of contour area to image area of 0.075194884 for letter A and 0.321412412 for letter D. The proportion of contour area in the image area is thus clearly smaller for A than for D. Therefore, in actual recognition we only need to compute the difference between the area proportion of the recognized letter's contour and the coefficients of letters A and D, and distinguish A from D by comparison. Figure 7 sketches the contour-area extraction for letters A and D.
Figure 7 Outline area extraction of letters A and D
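The comparison step amounts to picking whichever reference coefficient the measured ratio lies closer to. A minimal sketch using the two ratios reported above (function and constant names are mine):

```python
# Reference proportion coefficients from the paper's 182-sample statistics.
RATIO_A = 0.075194884
RATIO_D = 0.321412412

def classify_a_or_d(hole_area, img_area):
    """Distinguish A from D by the ratio of contour (hole) area to image area."""
    ratio = hole_area / img_area
    # pick whichever reference coefficient the measured ratio is closer to
    return 'A' if abs(ratio - RATIO_A) < abs(ratio - RATIO_D) else 'D'
```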
3.2 Shape feature extraction
Analyzing the shapes of English letters A~D reveals distinctive shape features for each letter. The specific operation is as follows: fill the preprocessed binary image (Figure 8(a)) to obtain Figure 8(b). It can be seen that the shape of letter A is approximately triangular; letter B consists of two raised parts and one recessed part; letter C likewise contains two raised parts and one recessed part; letter D has only one raised part. Figure 8(b) is then projected horizontally and the pixel count of each row is plotted in a projection histogram (Figure 8(c)). Analysis of each letter's data shows that the curve of letter A increases monotonically; the curve of letter B contains two peaks and one trough; the curve of letter C also contains two peaks and one trough; and the curve of letter D contains one peak.
Figure 8 Shape feature extraction process
After the data of each letter are obtained, a difference operation is performed on the data curve. A peak satisfies first derivative 0 and second derivative negative; a trough satisfies first derivative 0 and second derivative positive; a monotonic curve has a nonzero first derivative. In this way the letters can be separated into A, {B, C}, and D. The algorithm is as follows. The data curve is regarded as a one-dimensional vector
V = [v1, v2, …, vn]  (6)
where vi (i ∈ [1, n]) is the sum of white pixels on row i of the image. The first-order difference vector Diffv of V is computed as
Diffv(i) = v(i+1) − v(i), i = 1, …, n−1  (7)
After the difference vector is obtained, the sign-function operation is applied, as in Equation (8):
Sgnv(i) = sign(Diffv(i))  (8)
At this point the trend at each point is known. However, some points with equal values are neither peaks nor troughs, so a further first-order difference is applied to obtain the actual peaks and troughs of the data curve.
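The difference-then-sign procedure of Equations (7) and (8) can be sketched directly: after the second differencing, a negative step marks a crest and a positive step marks a trough. A plain-Python illustration (function names are mine):

```python
def sign(x):
    """Sign function of Eq. (8): -1, 0, or +1."""
    return (x > 0) - (x < 0)

def peaks_and_troughs(v):
    """v: projection data curve (white-pixel count per row).
    Returns (peak_indices, trough_indices)."""
    diff = [v[i + 1] - v[i] for i in range(len(v) - 1)]   # Eq. (7)
    s = [sign(d) for d in diff]                           # Eq. (8)
    peaks, troughs = [], []
    for i in range(len(s) - 1):
        step = s[i + 1] - s[i]        # second differencing of the signs
        if step < 0:
            peaks.append(i + 1)       # rising then falling: a crest
        elif step > 0:
            troughs.append(i + 1)     # falling then rising: a trough
    return peaks, troughs
```

A monotonically increasing curve (letter A) yields no peaks or troughs; two peaks with one trough indicates the {B, C} group; a single peak indicates D.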
3.3 Special handling
Since an examinee will inevitably sometimes fill in the wrong answer, this case needs special treatment. On the basis of character recognition, a marking convention for wrong answers is designed: the wrong answer is crossed out with two non-overlapping slashes drawn from left to right and two drawn from right to left, as shown in Figure 9. After such marking, the contour count of the crossed-out answer is higher than that of a normal letter, so a wrongly filled answer can be distinguished.
Figure 9 Special case processing
Part of the source code
function [output_args] = main(input_args)
clear all; warning off; clc
fid = fopen('text.txt','wt');
imgIn = imread('test2.bmp');
figure(); set(gcf,'Name','Pre-processing');
subplot(1,2,1); imshow(imgIn); title('original');
% imgIn = imnoise(imgIn,'salt & pepper',0.03);
% imgIn = imnoise(imgIn,'gaussian',0,0.03);
% imwrite(imgIn,'out','bmp');
imgGr = imgIn(:,:,3);
imgMed = medfilt2(imgGr);    % median filtering to remove salt-and-pepper noise
imgMed = medfilt2(imgMed);   % filter again to make the character regions more distinct
imgBw = im2bw(imgMed);
imgBw = ones(size(imgBw)) - imgBw;
imgBw = imclose(imgBw,strel('disk',3));
% zero the outermost pixels: median filtering and closing cannot process the border
imgBw(:,1) = 0; imgBw(:,end) = 0; imgBw(1,:) = 0; imgBw(end,:) = 0;
subplot(1,2,2); imshow(imgBw); title('Processed binary image');
%% segment the image and recognize each character
% loca is a [5,n] matrix recording each character's recognition result and position
figure; set(gcf,'Name','recognition');
[lines,cols] = size(imgBw); loca = []; m = 0;
for i = 1:lines
    for j = 1:cols
        if imgBw(i,j) == 1
            imgCut = areaGrow(imgBw,[i,j]);   % grow the connected region at (i,j)
            [x,y] = find(imgCut == 1);
            imgBw = imgBw - imgCut;           % remove the region from the working image
            imgSl = imgSlim(imgCut);
            imgOP = imresize(imgSl,[42 24]);  % scale the cropped image to 42x24
            charNum = identify(imgOP);        % character recognition
            m = m + 1;
            subplot(4,4,m); imshow(imgSl); xlabel(['Identified as: ' char(charNum)]);
            loca = [loca [charNum; min(x); max(x); min(y); max(y)]];
        end
    end
end
n = size(loca,2);   % number of columns of loca, i.e. the number of characters
suma = 0;           % suma records the height sum of all characters
for x = 1:n
    suma = suma + loca(3,x) - loca(2,x);
end
end

function imgCut = areaGrow(imgBW,p)
% region growing: extract the connected component containing seed point p
[lines,cols] = size(imgBW);
imgCut = zeros(lines,cols);   % zero matrix the same size as imgBW
imgCut(p(1),p(2)) = 1;        % (p(1),p(2)) is the seed point
count = 1;   % white pixels added per pass; the loop ends when none are added
while count > 0
    count = 0;
    for i = 1:lines
        for j = 1:cols
            if imgCut(i,j) == 1 && i > 1 && i < lines && j > 1 && j < cols  % white, not on the border
                for u = -1:1
                    for v = -1:1
                        if imgCut(i+u,j+v) == 0 && imgBW(i+u,j+v) == 1
                            imgCut(i+u,j+v) = 1;   % add this 8-neighbor to imgCut
                            count = count + 1;
                        end
                    end
                end
            end
        end
    end
end
end
Operation result
Matlab version
Matlab 2014a