Author: Wang Qian
1. Background
Receipt printing is a basic capability for retail merchants, and a receipt usually carries store information such as the store logo and the store QR code. The logo and QR code images that merchants upload are generally in color, while receipt printers generally only support black-and-white (binary) printing. To keep the merchant experience simple, we place no restrictions on the uploaded images; merchants can upload whatever personalized images suit their actual situation. Therefore, we must binarize the merchants' images before printing them.
This article describes the image binarization part of the "Good Retail receipt printing cross-platform solution".
2. Image binarization process
Image binarization sets the gray value of every pixel in an image (for an RGB color image, each pixel's RGB value is first converted to a gray value) to either 0 or 255, so that the whole image presents an obvious black-and-white effect.
The threshold T that separates the pixels mapped to 0 from those mapped to 255 is the core of binarization: an accurate threshold yields a better binary image.
Binarization overall flow chart:
As the flow chart shows, obtaining the grayscale image and calculating the threshold T are the core steps of binarization.
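The binarization step itself reduces to a per-pixel comparison against T. A minimal sketch (the helper name `binarize_gray` is illustrative, not part of the production code):

```c
#include <stddef.h>

/* Map each 8-bit gray value to pure black (0) or pure white (255)
 * using threshold T: gray <= T becomes black, otherwise white. */
void binarize_gray(unsigned char *gray, size_t count, unsigned char T) {
    for (size_t i = 0; i < count; i++) {
        gray[i] = (gray[i] <= T) ? 0 : 255;
    }
}
```

The production code later in this article applies the same comparison but writes ARGB values (0xFF000000 / 0xFFFFFFFF) instead of bare 0 / 255.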
3. Previous solutions
The previous scheme first converts the image to grayscale, then calculates the threshold T separating 0 from 255 with the OTSU algorithm (Otsu's method, the maximum between-class variance method), and finally binarizes the gray values against T to obtain the binary image.
All of our algorithms are implemented in C for cross-platform portability.
Flow chart:
Grayscale algorithm:
For converting RGB color to grayscale there is a well-known formula:

Gray = R × 0.299 + G × 0.587 + B × 0.114

This formula is called Luminosity (brightness) and is currently the most commonly used; the three coefficients are empirical values (see the Wikipedia entry for their origin).
In practice, however, we want to avoid slow floating-point operations. For efficiency, the formula is converted into integer and shift operations; the shift-based version used here is:

Gray = (R × 38 + G × 75 + B × 15) >> 7

(38/128 ≈ 0.297, 75/128 ≈ 0.586, 15/128 ≈ 0.117.) Readers interested in the derivation can look it up; it is not elaborated here.
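As a sanity check, the integer approximation can be compared against the floating-point Luminosity formula. A standalone sketch (the function names are illustrative; the weights 38/75/15 and the shift by 7 match the production code below):

```c
/* Floating-point Luminosity grayscale. */
static unsigned char gray_float(unsigned char r, unsigned char g, unsigned char b) {
    return (unsigned char)(0.299 * r + 0.587 * g + 0.114 * b);
}

/* Integer approximation: 38/128 ~ 0.297, 75/128 ~ 0.586, 15/128 ~ 0.117,
 * so the result stays within a couple of gray levels of the float version. */
static unsigned char gray_shift(unsigned char r, unsigned char g, unsigned char b) {
    return (unsigned char)((r * 38 + g * 75 + b * 15) >> 7);
}
```

Since the weights 38 + 75 + 15 = 128 sum exactly to 2^7, pure white maps to 255 and pure black to 0, with no overflow for 8-bit channels.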
The specific implementation algorithm is as follows:
```c
/**
 Grayscale an image.

 @param bit_map image pixel array address (ARGB format)
 @param width   image width
 @param height  image height
 @return        grayscale pixel array address (caller frees), or NULL on failure
 */
int * gray_image(int *bit_map, int width, int height) {
    int pixel_total = width * height; // Total number of pixels
    if (pixel_total <= 0) return NULL;
    // Grayscale pixel storage
    int *gray_pixels = (int *)malloc(pixel_total * sizeof(int));
    if (gray_pixels == NULL) return NULL;
    memset(gray_pixels, 0, pixel_total * sizeof(int));
    int *p = bit_map;
    for (int i = 0; i < pixel_total; i++, p++) {
        // Separate the three primary colors and the alpha channel
        u_char alpha = ((*p & 0xFF000000) >> 24);
        u_char red   = ((*p & 0x00FF0000) >> 16);
        u_char green = ((*p & 0x0000FF00) >> 8);
        u_char blue  =  (*p & 0x000000FF);
        u_char gray  = (red * 38 + green * 75 + blue * 15) >> 7;
        // Fully transparent black pixels are treated as white
        if (alpha == 0 && gray == 0) {
            gray = 0xFF;
        }
        gray_pixels[i] = gray;
    }
    return gray_pixels;
}
```
In this algorithm, the input parameter is required to be a bitmap in ARGB format so that all platforms are handled uniformly. int is used instead of unsigned int because Java has no unsigned integer types, and int works across all platforms.
OTSU algorithm:
The OTSU algorithm, also known as the maximum between-class variance method, was proposed by Nobuyuki Otsu in 1979. It is considered one of the best threshold-selection algorithms in image segmentation: the calculation is simple, and it is not affected by image brightness or contrast, so it is widely used in digital image processing. It divides the image into background and foreground according to the gray-level characteristics of the image. Since variance is a measure of the uniformity of the gray distribution, the larger the between-class variance of background and foreground, the larger the difference between the two parts of the image. When part of the foreground is wrongly classified as background, or part of the background as foreground, this difference shrinks. Therefore a segmentation that maximizes the between-class variance minimizes the probability of misclassification.
Principle:
For an image I(x, y), denote the segmentation threshold between foreground (i.e. target) and background as T, the proportion of foreground pixels in the whole image as ω0 with average gray level μ0, and the proportion of background pixels as ω1 with average gray level μ1. Denote the total average gray level of the image as μ and the between-class variance as g.
Assume the background of the image is dark and the image size is M × N. Denote the number of pixels with gray value less than the threshold T as N0, and the number with gray value greater than or equal to T as N1. Then:
ω0 = N0 / (M × N)    (1)
ω1 = N1 / (M × N)    (2)
N0 + N1 = M × N    (3)
ω0 + ω1 = 1    (4)
μ = ω0 × μ0 + ω1 × μ1    (5)
g = ω0 × (μ0 − μ)² + ω1 × (μ1 − μ)²    (6)
Substituting equation (5) into equation (6) gives the equivalent formula:
g = ω0 × ω1 × (μ0 − μ1)²    (7)
Formula (7) is the between-class variance formula. The threshold T that maximizes the between-class variance g is found by exhaustively traversing all candidate thresholds.
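Formula (7) leads directly to the threshold search: walk every candidate T, maintain running sums, and keep the T that maximizes g. A minimal sketch operating on a ready-made histogram (assumption: `histogram[i]` already holds the pixel count for gray value i; the full production function below builds the histogram itself):

```c
/* Exhaustive OTSU threshold search over a 256-bin gray histogram.
 * Returns the T maximizing g = wB * wF * (mB - mF)^2, i.e. formula (7)
 * with raw counts instead of proportions (the scale factor does not
 * change which T wins). */
int otsu_threshold(const double histogram[256], double pixel_total) {
    double sum1 = 0.0; /* total gray mass of the image */
    for (int i = 0; i < 256; i++) sum1 += i * histogram[i];

    double wB = 0.0, sumB = 0.0, max_g = 0.0;
    int threshold = 0;
    for (int i = 0; i < 256; i++) {
        wB += histogram[i];              /* background pixel count */
        double wF = pixel_total - wB;    /* foreground pixel count */
        if (wB == 0 || wF == 0) continue;
        sumB += i * histogram[i];
        double mB = sumB / wB;           /* background mean gray */
        double mF = (sum1 - sumB) / wF;  /* foreground mean gray */
        double g = wB * wF * (mB - mF) * (mB - mF);
        if (g >= max_g) { max_g = g; threshold = i; }
    }
    return threshold;
}
```

For a cleanly bimodal histogram, any T between the two peaks separates the classes; the `>=` comparison means the search settles on the highest such T, matching the behavior of the production code.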
Because the OTSU algorithm calculates the threshold from gray histogram data, its first two steps are:
1. Grayscale the original image
2. Compute the gray histogram
Each of these steps traverses the image; handling them separately multiplies the number of traversals. So here the steps are merged, removing unnecessary traversals and improving performance.
The specific implementation algorithm is as follows:
```c
/**
 OTSU algorithm: obtain the binary image.

 @param bit_map image pixel array address (ARGB format)
 @param width   image width
 @param height  image height
 @param T       out-parameter storing the calculated threshold (may be NULL)
 @return        binary pixel array address (caller frees), or NULL on failure
 */
int * binary_image_with_otsu_threshold_alg(int *bit_map, int width, int height, int *T) {
    int pixel_total = width * height; // Total number of pixels
    if (pixel_total <= 0) return NULL;
    unsigned long sum1 = 0;      // Total gray value
    unsigned long sumB = 0;      // Background total gray value
    double wB = 0.0;             // Background pixel count
    double wF = 0.0;             // Foreground pixel count
    double mB = 0.0;             // Background average gray value
    double mF = 0.0;             // Foreground average gray value
    double max_g = 0.0;          // Maximum between-class variance
    double g = 0.0;              // Between-class variance
    u_char threshold = 0;        // Threshold
    double histogram[256] = {0}; // Gray histogram: index is gray value, content is pixel count

    // Build the grayscale image, the gray histogram and the total gray value in one pass
    int *gray_pixels = (int *)malloc(pixel_total * sizeof(int));
    if (gray_pixels == NULL) return NULL;
    memset(gray_pixels, 0, pixel_total * sizeof(int));
    int *p = bit_map;
    for (int i = 0; i < pixel_total; i++, p++) {
        // Separate the three primary colors and the alpha channel
        u_char alpha = ((*p & 0xFF000000) >> 24);
        u_char red   = ((*p & 0x00FF0000) >> 16);
        u_char green = ((*p & 0x0000FF00) >> 8);
        u_char blue  =  (*p & 0x000000FF);
        u_char gray  = (red * 38 + green * 75 + blue * 15) >> 7;
        // Fully transparent black pixels are treated as white
        if (alpha == 0 && gray == 0) {
            gray = 0xFF;
        }
        gray_pixels[i] = gray;
        histogram[gray]++;
        sum1 += gray;
    }

    // OTSU threshold search
    for (int i = 0; i < 256; i++) {
        // Raw pixel counts are used instead of proportions;
        // the common scale factor does not affect which T maximizes g
        wB = wB + histogram[i];
        wF = pixel_total - wB;
        if (wB == 0 || wF == 0) {
            continue;
        }
        sumB = sumB + i * histogram[i];
        mB = sumB / wB;
        mF = (sum1 - sumB) / wF;
        g = wB * wF * (mB - mF) * (mB - mF);
        if (g >= max_g) {
            threshold = i;
            max_g = g;
        }
    }

    // Binarize: pixels at or below the threshold become black, the rest white
    for (int i = 0; i < pixel_total; i++) {
        gray_pixels[i] = gray_pixels[i] <= threshold ? 0xFF000000 : 0xFFFFFFFF;
    }
    if (T) {
        *T = threshold; // OTSU threshold
    }
    return gray_pixels;
}
```
Test execution times:
iPhone 6, imageSize 260 × 260: OTSU time 0.005254 s; 5 asynchronous runs: 0.029240 s
iPhone 6, imageSize 620 × 284: OTSU time 0.029476 s; 5 asynchronous runs: 0.050313 s
iPhone 6, imageSize 2560 × 1440: OTSU time 0.200595 s; 5 asynchronous runs: 0.684509 s
Testing shows the algorithm runs in milliseconds, and since our images are generally not large, performance is not a problem.
Effect after processing:
The binary image produced by the OTSU algorithm satisfies most merchant logos.
However, it falls short in some real scenarios. For example, when a merchant's logo contains strongly contrasting colors, the printed image may not match the merchant's intent. Take the following logo:
To the algorithm, the gray value of the yellow part of the logo is greater than the threshold, so after binarization it becomes white; but to the merchant, part of the information in the red box of the logo is lost, which may not meet the merchant's needs.
Summary of existing problems:
- A single algorithm: for some images the processing result may not match expectations
- The image is reprocessed on every print; there is no caching mechanism
4. New solution
To address the two problems of the previous scheme, the new scheme adds targeted optimizations.
4.1 Problem 1 (a single algorithm may produce results inconsistent with expectations for some images)
Multiple algorithms are added to calculate the threshold T. The binary image produced by each algorithm is compared for similarity against the grayscale of the original image, and the threshold of the most similar result is taken as the optimal threshold T.
Flow chart:
In the whole process, the three algorithms binarize the image in parallel. An image-fingerprint hashCode is computed for each binary image and compared with the original image's fingerprint hashCode; the binary image closest to the original is taken as the optimal one.
The OTSU algorithm has been described above, so this section analyzes the average gray algorithm and the bimodal average algorithm.
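The selection step can be sketched as picking the candidate with the smallest fingerprint distance. A minimal illustration (the helper name `best_candidate` is hypothetical; in the real flow the distances come from comparing each algorithm's binary-image fingerprint against the original image's fingerprint):

```c
#include <stddef.h>

/* Return the index of the candidate whose fingerprint distance to the
 * original image is smallest. distances[i] is the Hamming distance
 * between candidate i's fingerprint and the original's fingerprint. */
size_t best_candidate(const int *distances, size_t count) {
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        if (distances[i] < distances[best]) best = i;
    }
    return best;
}
```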
Average gray algorithm:
The average gray algorithm is very simple: grayscale the image, then use the average gray value as the threshold. If the total gray value is sum and the total number of pixels is pixel_total, the threshold is T = sum / pixel_total.
The specific implementation algorithm is as follows:
```c
/**
 Average gray algorithm: obtain the binary image.

 @param bit_map image pixel array address (ARGB format)
 @param width   image width
 @param height  image height
 @param T       out-parameter storing the calculated threshold (may be NULL)
 @return        binary pixel array address (caller frees), or NULL on failure
 */
int * binary_image_with_average_gray_threshold_alg(int *bit_map, int width, int height, int *T) {
    int pixel_total = width * height; // Total number of pixels
    if (pixel_total <= 0) return NULL;
    unsigned long sum = 0;  // Total gray value
    u_char threshold = 0;   // Threshold
    int *gray_pixels = (int *)malloc(pixel_total * sizeof(int));
    if (gray_pixels == NULL) return NULL;
    memset(gray_pixels, 0, pixel_total * sizeof(int));
    int *p = bit_map;
    for (int i = 0; i < pixel_total; i++, p++) {
        // Separate the three primary colors and the alpha channel
        u_char alpha = ((*p & 0xFF000000) >> 24);
        u_char red   = ((*p & 0x00FF0000) >> 16);
        u_char green = ((*p & 0x0000FF00) >> 8);
        u_char blue  =  (*p & 0x000000FF);
        u_char gray  = (red * 38 + green * 75 + blue * 15) >> 7;
        // Fully transparent black pixels are treated as white
        if (alpha == 0 && gray == 0) {
            gray = 0xFF;
        }
        gray_pixels[i] = gray;
        sum += gray;
    }
    // The average gray value is the threshold
    threshold = sum / pixel_total;
    for (int i = 0; i < pixel_total; i++) {
        gray_pixels[i] = gray_pixels[i] <= threshold ? 0xFF000000 : 0xFFFFFFFF;
    }
    if (T) {
        *T = threshold;
    }
    return gray_pixels;
}
```
Bimodal average algorithm:
This method applies to images whose gray histogram has an obvious bimodal shape: the threshold is chosen between the two peaks. It may fail to find a threshold, so it is unsuitable for images with a flat or unimodal histogram. The implementation is iterative: before each pass the histogram is checked; if it is not yet bimodal, it is smoothed with a radius-1 (window size 3) moving average. If after a fixed number of iterations, say 1000, the histogram is still not bimodal, the algorithm gives up and reports failure; if it succeeds, the final threshold is the average of the two peak positions. The steps are therefore:
1. Obtain the grayscale of the original image
2, gray square statistics
3, smooth histogram
4. Calculate the average value of the two peaks as the threshold T
Step 3, smoothing the histogram, is the iterative part. The flow chart is as follows:
The specific implementation algorithm is as follows:
```c
// Check whether the histogram is bimodal (has exactly two peaks)
int is_double_peak(double *histogram) {
    int peak_count = 0;
    for (int i = 1; i < 255; i++) {
        if (histogram[i - 1] < histogram[i] && histogram[i + 1] < histogram[i]) {
            peak_count++;
            if (peak_count > 2) return 0;
        }
    }
    return peak_count == 2;
}

/**
 Bimodal average algorithm: obtain the binary image.

 @param bit_map image pixel array address (ARGB format)
 @param width   image width
 @param height  image height
 @param T       out-parameter storing the calculated threshold (may be NULL)
 @return        binary pixel array address (caller frees), or NULL on failure
 */
int * binary_image_with_average_peak_threshold_alg(int *bit_map, int width, int height, int *T) {
    int pixel_total = width * height; // Total number of pixels
    if (pixel_total <= 0) return NULL;
    // Gray histogram: index is gray value, content is pixel count.
    // Smoothing overwrites the data, so two copies are kept.
    double histogram1[256] = {0};
    double histogram2[256] = {0};
    u_char threshold = 0; // Threshold
    // Build the grayscale image and the gray histogram
    int *gray_pixels = (int *)malloc(pixel_total * sizeof(int));
    if (gray_pixels == NULL) return NULL;
    memset(gray_pixels, 0, pixel_total * sizeof(int));
    int *p = bit_map;
    for (int i = 0; i < pixel_total; i++, p++) {
        // Separate the three primary colors and the alpha channel
        u_char alpha = ((*p & 0xFF000000) >> 24);
        u_char red   = ((*p & 0x00FF0000) >> 16);
        u_char green = ((*p & 0x0000FF00) >> 8);
        u_char blue  =  (*p & 0x000000FF);
        u_char gray  = (red * 38 + green * 75 + blue * 15) >> 7;
        // Fully transparent black pixels are treated as white
        if (alpha == 0 && gray == 0) {
            gray = 0xFF;
        }
        gray_pixels[i] = gray;
        histogram1[gray]++;
        histogram2[gray]++;
    }
    // While the histogram is not bimodal, smooth it with a 3-point moving average
    int times = 0;
    while (!is_double_peak(histogram2)) {
        times++;
        if (times > 1000) { // Cap at 1000 iterations; endless smoothing would hurt performance
            free(gray_pixels); // The histogram cannot be smoothed into a bimodal shape
            return NULL;
        }
        histogram2[0] = (histogram1[0] + histogram1[0] + histogram1[1]) / 3;   // First point
        for (int i = 1; i < 255; i++) {
            histogram2[i] = (histogram1[i - 1] + histogram1[i] + histogram1[i + 1]) / 3; // Middle points
        }
        histogram2[255] = (histogram1[254] + histogram1[255] + histogram1[255]) / 3; // Last point
        memcpy(histogram1, histogram2, 256 * sizeof(double)); // Keep a copy for the next iteration
    }
    // The threshold T is the average of the two peak positions
    int peak[2] = {0};
    for (int i = 1, y = 0; i < 255; i++) {
        if (histogram2[i - 1] < histogram2[i] && histogram2[i + 1] < histogram2[i]) {
            peak[y++] = i;
        }
    }
    threshold = (peak[0] + peak[1]) / 2;
    for (int i = 0; i < pixel_total; i++) {
        gray_pixels[i] = gray_pixels[i] <= threshold ? 0xFF000000 : 0xFFFFFFFF;
    }
    if (T) {
        *T = threshold;
    }
    return gray_pixels;
}
```
Test execution times:
iPhone 6, imageSize 260 × 260: average_peak time 0.035254 s
iPhone 6, imageSize 800 × 800: average_peak time 0.101282 s
Testing shows the algorithm is fine for relatively small images. Larger images cost noticeably more, and depending on the image's color distribution the smoothing may loop many times, which also hurts performance. Since logos are compressed before processing and are rarely very large, the processing time stays within an acceptable range; moreover, the processing and comparison run on an asynchronous thread and do not block the main flow.
Image fingerprint hashCode:
An image fingerprint hashCode can be understood as a unique identifier of an image. A simple image-fingerprint generation procedure (the classic average-hash approach) involves the following steps:
1. Shrink the image, usually to 8 × 8, i.e. 64 pixels in total.
2. Convert the shrunken image to grayscale.
3. Calculate the average gray value of the grayscale image.
4. Compare each pixel's gray value with the average: above average is recorded as 1, below as 0.
5. Assemble the hash: the results of step 4 form a 64-bit integer, which is the image's fingerprint hashCode.
6. To compare the fingerprint hashCodes of different images, count how many of the 64 bits differ (the "Hamming distance"); the fewer the differences, the more similar the images.
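The six steps above (for the basic 8 × 8 case) can be sketched as follows. The function names `ahash64` and `hamming64` are illustrative, and the 64-entry gray input is assumed to be already shrunken and grayscaled (steps 1 and 2):

```c
#include <stdint.h>

/* Steps 3-5: average the 64 gray values, then build a 64-bit hash
 * where bit i is 1 if pixel i is above the average. */
uint64_t ahash64(const unsigned char gray[64]) {
    unsigned sum = 0;
    for (int i = 0; i < 64; i++) sum += gray[i];
    unsigned avg = sum / 64;
    uint64_t hash = 0;
    for (int i = 0; i < 64; i++) {
        if (gray[i] > avg) hash |= (uint64_t)1 << i;
    }
    return hash;
}

/* Step 6: Hamming distance = number of differing bits. */
int hamming64(uint64_t a, uint64_t b) {
    uint64_t x = a ^ b;
    int d = 0;
    while (x) { d += (int)(x & 1); x >>= 1; }
    return d;
}
```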
However, this basic algorithm discriminates poorly in our case: binary logo images compressed to 8 × 8 are all highly similar, so the 8 × 8 hashCode carries a large error, which testing confirmed. On this basis, steps 1, 5 and 6 above were improved as follows:
1. The image size is customizable (integer dimensions) but must contain at least 64 pixels, i.e. width × height >= 64; a multiple of 64 pixels is recommended to reduce error.
5. The hash is no longer a single 64-bit integer but an array of 64-bit integers, one per 64-pixel block; each generated 64-bit hashCode is appended to the array, and the array as a whole is the image fingerprint.
6. To compare fingerprints, traverse the arrays and compare the differing bits of each pair of 64-bit integers; the final result is the sum of the differing bits over all pairs.
In testing against our merchants' logos, we found that compressing to 128 × 128 gives satisfactory results.
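With the improved scheme, a fingerprint is an array of 64-bit words, and the comparison in step 6 sums per-word bit differences. A sketch (assuming both fingerprints have the same word count; the function name is illustrative):

```c
#include <stdint.h>
#include <stddef.h>

/* Total Hamming distance between two fingerprints of `words` 64-bit blocks. */
int fingerprint_distance(const uint64_t *a, const uint64_t *b, size_t words) {
    int d = 0;
    for (size_t w = 0; w < words; w++) {
        uint64_t x = a[w] ^ b[w];
        while (x) { d += (int)(x & 1); x >>= 1; }
    }
    return d;
}
```

For a 128 × 128 fingerprint this is 16384 / 64 = 256 words per image.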
Example where the optimal algorithm is OTSU:
Example where the optimal algorithm is the average gray algorithm:
Example where the optimal algorithm is the bimodal average algorithm:
In practice we found that the bimodal average is chosen with relatively low probability; that is, most logos select either OTSU or the average gray algorithm. We may therefore add selection statistics later: if the bimodal average really is rarely chosen and its results are close to those of the other two, it can be removed.
4.2 Problem 2 (images are processed on every print; there is no caching mechanism)
A caching mechanism is added. A store's logo and QR code are generally fixed and rarely change, so the images can be preprocessed when the merchant logs into the store or modifies the store QR code; the generated print instructions are cached, and subsequent prints simply read the cache.
Since the cached content is the processed print-instruction string, it is stored with NSUserDefaults.
Flow chart of cache policy:
Why handle modification of the store QR code but not the store logo? In our app the logo cannot be modified on the phone, only in the PC back office, and the store information is fetched directly after logging into the store. The store QR code, however, is an image the merchant uploads in the receipt template settings, so merchants can modify it themselves in the app.
Printing picture processing flow chart:
In the new flow, if the image is not found in the cache, the old scheme is used to process it. The reason is that at that moment the merchant is printing a receipt in real time; the new scheme would take longer and degrade the user experience, while the old scheme has run in production for a long time, so falling back to it is safe.
5. Future expectations and planning
Several optimizations are planned as follow-ups:
- Add statistics to the new flow, recording which algorithm wins for merchant logos and store QR codes, to provide data for later optimization.
- If a merchant is not satisfied with the processed result, let them adjust the threshold T of the binary image themselves until they are satisfied.
- Fix stale caches: the local cache is currently not updated after an image is modified in the PC back office.
- Finer image processing: for QR codes, a block-based (local) thresholding algorithm could be used.
For the second point, where the merchant picks the threshold T, the preview effect is as follows:
References
- Grayscale conversion formulas, by Zyl910
- Otsu's method, Wikipedia
- Image processing algorithms, by LaviewPbt