I. Background

Throughout 2017, Alipay's offline scenarios kept expanding: payment collection codes, Koubei, shared bikes, shared power banks, parking fees, and other products have made daily life more and more convenient. Because of its low cost and good compatibility, the QR code has become the most important tool for connecting offline scenarios to online services, and it therefore faces new challenges. A QR code is a dot-matrix encoding, so any visual defect, bending, or lighting effect can greatly reduce the recognition success rate. When recognition is difficult, users may simply give up, which hurts the payment experience and the user's perception of the product.

The most critical factors for users’ scanning experience are as follows:

1. Recognition rate: the basic metric of a code-scanning service, which directly reflects its recognition ability;

2. Recognition time: this includes app startup time and image recognition time, and measures how long it takes from the user tapping the app to the content being correctly recognized. With every additional second, a large number of users give up waiting and leave;

3. Accurate feedback: results must be returned to the user promptly and must also be very accurate. This matters especially in today's offline scenarios, where several QR codes may be in view at once and the user should not have to scan a second time.

This article covers these three aspects and describes how the Alipay code-scanning team built an accurate, fast, and stable scanning experience for users.

II. Improving the recognition rate

We ran extensive statistical analysis on user feedback and found that most failures came from non-standard QR codes. Unfortunately, when we measured the recognition rate of our earlier scanner version, it was only 60%.

Strategy 1: Optimize the finder pattern (pile point) search algorithm

Aspect-ratio tolerance: in the previous scanning algorithm, the aspect-ratio check allowed a difference of 40%. However, the error was judged relative to whichever dimension had been measured first, so the result depended on the order of width and height. As a result, some codes with a distorted aspect ratio could not be scanned horizontally, yet scanned fine after being rotated 90 degrees to vertical (OMG).

Optimization strategy

  • Modify the aspect-ratio decision rule so that the result no longer depends on the order of width and height;

  • For a known length, widen the acceptable width range to increase aspect-ratio tolerance.

In our comparison test set, the recognition rate increased by about 1%.
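The order dependence described above can be illustrated with a minimal sketch. The exact formula used by the old scanner is not given in this article, so `ratio_ok_order_dependent` is only an assumed form of such a check, and both function names are hypothetical.

```python
def ratio_ok_order_dependent(first: float, second: float, tol: float = 0.4) -> bool:
    """Assumed old-style check: the error is measured relative to the first value."""
    return abs(first - second) <= tol * first


def ratio_ok_symmetric(width: float, height: float, tol: float = 0.4) -> bool:
    """Order-independent check: the error is measured relative to the longer side."""
    longer, shorter = max(width, height), min(width, height)
    return longer - shorter <= tol * longer


# A stretched code whose sides measure 10 and 15 units.
print(ratio_ok_order_dependent(10, 15))  # measured in one orientation
print(ratio_ok_order_dependent(15, 10))  # same code rotated by 90 degrees
print(ratio_ok_symmetric(10, 15), ratio_ok_symmetric(15, 10))
```

With the assumed old rule the same code is rejected in one orientation and accepted in the other, while the symmetric rule gives the same answer both ways.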

Strategy 2: Add a 1:5:1 finder pattern recognition mode

To find a QR code in an image, the key is to locate its finder patterns (pile points):

The nested-square patterns at three corners of the code are the QR code's locating features. Along a line through the middle of each pattern, the black and white runs have the ratio 1:1:3:1:1.

In the previous scanning algorithm, finder patterns were identified as follows. A state machine scans for the 1:1:3:1:1 pattern along a row and takes the midpoint as the x position (at this point the scan line is on the first row that shows the 1:1:3:1:1 ratio). The algorithm then searches for 1:1:3:1:1 vertically at that x position to determine the y position, and finally searches horizontally again at (x, y) to correct the x position. If an interference point is hit during any of these 1:1:3:1:1 searches, even a single pixel of salt-and-pepper noise, the finder pattern search fails. (Rendering the finder patterns in Alipay's royal blue produces a large number of noise points in the blue area, which leads to a low recognition rate.)

We therefore added a second way to identify finder patterns: when the state machine matches the 1:5:1 pattern, it attempts to confirm a finder pattern. (Here the scan line is on the first row that shows the 1:5:1 ratio.)
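As a rough illustration of why a 1:5:1 mode helps, the sketch below matches run-length ratios on single scan lines. It is not the scanner's actual state machine: the helper names, the tolerance of half a module, and the toy rows are all assumptions, and the confirmation step that follows a 1:5:1 match is not shown.

```python
from itertools import groupby

PATTERNS = {"11311": (1, 1, 3, 1, 1), "151": (1, 5, 1)}


def run_lengths(row):
    """Collapse a binary scan line (1 = dark, 0 = light) into (value, length) runs."""
    return [(value, len(list(group))) for value, group in groupby(row)]


def matches(runs, pattern, tol=0.5):
    """True if the runs start dark and each fits the pattern within tol * module."""
    if runs[0][0] != 1:
        return False
    module = sum(length for _, length in runs) / sum(pattern)
    return all(abs(length - p * module) <= tol * module
               for (_, length), p in zip(runs, pattern))


def find_candidates(row):
    """Return (mode, center_x) for every window of runs that fits a pattern."""
    runs = run_lengths(row)
    starts = [0]
    for _, length in runs:
        starts.append(starts[-1] + length)
    found = []
    for mode, pattern in PATTERNS.items():
        n = len(pattern)
        for i in range(len(runs) - n + 1):
            if matches(runs[i:i + n], pattern):
                found.append((mode, (starts[i] + starts[i + n]) / 2))
    return found


# Toy rows with 2-pixel modules and a 4-pixel light margin on each side.
clean_center = [0]*4 + [1]*2 + [0]*2 + [1]*6 + [0]*2 + [1]*2 + [0]*4
noisy_center = [0]*4 + [1]*2 + [0]*2 + [1, 1, 0, 1, 1, 1] + [0]*2 + [1]*2 + [0]*4
ring_row = [0]*4 + [1]*2 + [0]*10 + [1]*2 + [0]*4

print(find_candidates(clean_center))
print(find_candidates(noisy_center))
print(find_candidates(ring_row))
```

In this toy case a single light pixel inside the dark core breaks the 1:1:3:1:1 match, while the row that crosses only the outer ring still yields a 1:5:1 candidate at the same center.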

Optimization results

  • The new search method is no longer affected by dirt at the center or edge of a finder pattern, and the recognition rate of codes with Alipay royal blue finder patterns improves significantly.

  • After the change, the overall recognition rate improves by nearly 1%, but the time spent on failed recognitions increases.

Strategy 3: Add a diagonal filtering rule

Before enumerating all possible finder pattern combinations, which is O(N^3), a diagonal check is run on every candidate finder pattern. Because the diagonal of a finder pattern should also satisfy the 1:1:3:1:1 ratio, this rule filters out suspicious candidates and cuts the amount of computation, which reduces the time taken by both successful and failed recognitions.
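A minimal sketch of such a diagonal check follows, assuming a binarized image stored as a list of rows (1 = dark). The function names, the tolerance, and the synthetic test images are assumptions made for illustration; the real scanner's implementation is not described in this article.

```python
from itertools import groupby


def finder_image(module=2, quiet=2):
    """Render a 7x7-module finder pattern with a light margin (1 = dark)."""
    n = 7 + 2 * quiet

    def dark(r, c):
        r, c = r - quiet, c - quiet
        if not (0 <= r < 7 and 0 <= c < 7):
            return 0
        ring = r in (0, 6) or c in (0, 6)
        core = 2 <= r <= 4 and 2 <= c <= 4
        return int(ring or core)

    size = n * module
    return [[dark(y // module, x // module) for x in range(size)]
            for y in range(size)]


def passes_diagonal_check(img, cx, cy, tol=0.5):
    """True if the 45-degree line through (cx, cy) shows 1:1:3:1:1 around the center."""
    h, w = len(img), len(img[0])
    lo, hi = -min(cx, cy), min(w - 1 - cx, h - 1 - cy)
    line = [img[cy + d][cx + d] for d in range(lo, hi + 1)]
    center = min(cx, cy)  # index of (cx, cy) inside the sampled line
    runs = [(v, len(list(g))) for v, g in groupby(line)]
    starts, pos = [], 0
    for _, length in runs:
        starts.append(pos)
        pos += length
    for i in range(len(runs) - 4):
        window = runs[i:i + 5]
        total = sum(length for _, length in window)
        if window[0][0] != 1 or not starts[i] <= center < starts[i] + total:
            continue
        unit = total / 7
        if all(abs(length - p * unit) <= tol * unit
               for (_, length), p in zip(window, (1, 1, 3, 1, 1))):
            return True
    return False


finder = finder_image()
profile = finder[11]  # the 1:1:3:1:1 row through the center
# A cross-shaped false candidate: 1:1:3:1:1 horizontally and vertically only.
cross = [[int((8 <= y <= 13 and profile[x]) or (8 <= x <= 13 and profile[y]))
          for x in range(22)] for y in range(22)]

print(passes_diagonal_check(finder, 11, 11))
print(passes_diagonal_check(cross, 11, 11))
```

The cross-shaped candidate would pass horizontal and vertical ratio checks through its center but is rejected on the diagonal, so it never enters the O(N^3) combination step.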

Strategy 4: QR code classifier based on Logistic Regression

In the previous scanning algorithm, once three finder patterns had been obtained, three values were checked: the included angle, the length deviation, and the unit length. A simple formula computed a threshold to decide whether the triple could be a QR code, and the probability of misjudgment was high.

To address this, we introduced logistic regression, a machine learning model. Using Alipay's rich QR code data set, we trained a logistic regression model as a QR code classifier. It significantly reduces the probability of misjudgment, and it also significantly reduces the time taken to report failure when no QR code is present.
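To make the idea concrete, here is a minimal logistic regression sketch in pure Python. The three features mirror the quantities named above (angle, length deviation, unit length), but their normalization, the training samples, and the hyperparameters are invented for illustration and are not Alipay's data or model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that a finder pattern triple belongs to a real QR code."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(samples, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(samples, labels):
            err = predict(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Made-up features: (angle deviation, length deviation, unit-length deviation),
# each scaled to roughly 0..1. Label 1 = real QR code, 0 = false combination.
samples = [
    (0.02, 0.03, 0.05), (0.05, 0.08, 0.04), (0.10, 0.05, 0.10), (0.08, 0.12, 0.06),
    (0.45, 0.50, 0.40), (0.60, 0.30, 0.55), (0.35, 0.60, 0.50), (0.70, 0.65, 0.30),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train(samples, labels)
print(predict(weights, bias, (0.03, 0.04, 0.05)) > 0.5)  # plausible triple
print(predict(weights, bias, (0.50, 0.50, 0.50)) > 0.5)  # implausible triple
```

Compared with a hand-tuned threshold formula, the learned weights set the trade-off between the three features from data, and rejected triples can be dropped before any decoding is attempted.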

Strategy 5: Change the skip-scan interval

Camera frames have high resolution, many pixels, and a heavy computation cost, so the previous scanning algorithm sampled with skips in both the horizontal and vertical directions. In practice, however, too many columns were skipped, so some of the 1-wide runs of the 1:1:3:1:1 pattern were missed and the finder pattern search failed.

We made the skip interval configurable and found the most suitable skipping strategy through online A/B gray-release testing. After this strategy was rolled out to all users, the recognition rate improved significantly.
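The effect of the skip interval can be seen on a single toy scan line. In this sketch the `stride` parameter stands in for the configurable skip interval; the name and the pixel sizes are assumptions.

```python
from itertools import groupby


def dark_runs(row, stride):
    """Count the dark runs that survive when only every stride-th pixel is read."""
    return sum(1 for value, _ in groupby(row[::stride]) if value == 1)


# A 1:1:3:1:1 row with 2-pixel modules and a 4-pixel light margin on each side.
row = [0]*4 + [1]*2 + [0]*2 + [1]*6 + [0]*2 + [1]*2 + [0]*4

for stride in (1, 2, 3):
    print(stride, dark_runs(row, stride))
```

Once the stride exceeds the module width, the two outer 1-wide dark runs fall between samples and only the core remains, so the ratio test can no longer succeed.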

Performance of the above optimizations on the test set

In summary, on a test set of 7,744 images, the core recognition rate of the scanner improved by 6.95 percentage points.

Special-scenario optimizations

Beyond the general scanning optimizations above, we also improved scanning in special scenarios.

1. Distortion? No problem!

Offline scenarios are complex and varied: the QR code on a beverage bottle is deformed, the QR code on a supermarket receipt is curled, and the QR code at a street vendor's stall is uneven or even folded. Such distorted QR codes are harder to recognize and can cause recognition to fail outright. The previous anti-distortion strategy established the mapping with a perspective transformation. Its advantage is good adaptability: it covers most application scenarios. Its disadvantages are also obvious. For Version 1 codes, the mapping degrades to an affine transformation, so the result is poor and the phone must be held parallel to the code plane for recognition to work. It also performs poorly when the material surface is not flat.

Optimization strategy

  • Assume that the sampling coordinate system and the QR code coordinate system are related by a more complex mapping. If the curl of the material surface is small, quadratic functions fit this mapping better.

  • QR codes on real invoices are generally Version 7 or higher. Higher-version QR codes contain multiple alignment patterns, which makes it easier to construct the quadratic mapping.

  • Based on this reasoning, the new mapping replaces the old perspective transformation to obtain more accurate sampling.
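One way to picture the quadratic mapping is as a least-squares fit from code coordinates (u, v) to image coordinates (x, y), using the finder and alignment pattern centers as control points. The sketch below assumes the basis 1, u, v, u², uv, v² and uses synthetic control points; the article does not state the exact form used in the scanner.

```python
def basis(u, v):
    """Quadratic basis terms for a code-space point (u, v)."""
    return [1.0, u, v, u * u, u * v, v * v]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_quadratic(code_pts, image_coords):
    """Least-squares fit of one image coordinate via the normal equations."""
    rows = [basis(u, v) for u, v in code_pts]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
    atb = [sum(r[i] * t for r, t in zip(rows, image_coords)) for i in range(6)]
    return solve(ata, atb)


def map_point(coeffs, u, v):
    """Evaluate the fitted quadratic at a code-space point."""
    return sum(c * t for c, t in zip(coeffs, basis(u, v)))


# Synthetic control points on a 3x3 grid (normalized code coordinates),
# imaged through a slightly curled surface.
grid = [(u, v) for u in (0.0, 0.5, 1.0) for v in (0.0, 0.5, 1.0)]
xs = [5 + 60 * u + 8 * u * u for u, v in grid]
ys = [7 + 60 * v + 4 * u * u for u, v in grid]

coeff_x = fit_quadratic(grid, xs)
coeff_y = fit_quadratic(grid, ys)
print(map_point(coeff_x, 0.25, 0.75), map_point(coeff_y, 0.25, 0.75))
```

The fit needs at least six well-spread control points, which is why higher-version codes with several alignment patterns are a better match for this approach than Version 1 codes.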

With the new strategy, QR code recognition in the invoice scenario improved significantly.

Note: because of the enhancement algorithm, keep the QR code aligned in the frame and wait a moment.

2. Improved fault-tolerant recognition ability

Merchants and suppliers who generate QR codes often place a logo over the middle of the code, and this region can cause decoding errors.

Optimization strategy: in the BitMatrix obtained after sampling, take the points inside a rectangular region in the middle and change their values so that the code can pass the error-correction boundary check. There are two strategies: the first inverts each point, and the second assigns each point a random value. Currently the rectangular region is 1/4 of the code's length by 1/4 of its width.

With this optimization, the scanner's fault tolerance improved significantly.
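The two strategies can be sketched on a plain list-of-lists bit matrix. This is only an illustration: the function names are hypothetical, the matrix is assumed to be square, and the decoding attempt that would follow each variant is not shown.

```python
import random


def center_bounds(size):
    """Bounds of a centered square whose side is 1/4 of the matrix side."""
    side = max(1, size // 4)
    start = (size - side) // 2
    return start, start + side


def repaired_variants(bits, seed=0):
    """Return two copies of the matrix: center inverted, then center randomized."""
    lo, hi = center_bounds(len(bits))
    rng = random.Random(seed)
    inverted = [row[:] for row in bits]
    randomized = [row[:] for row in bits]
    for y in range(lo, hi):
        for x in range(lo, hi):
            inverted[y][x] ^= 1                   # strategy 1: flip the bit
            randomized[y][x] = rng.randint(0, 1)  # strategy 2: random value
    return inverted, randomized


bits = [[0] * 8 for _ in range(8)]
inverted, randomized = repaired_variants(bits)
print(sum(map(sum, inverted)))  # number of bits changed by inversion
```

Each variant would then be passed to the decoder in turn; if the changed bits bring the error count back under the error-correction limit, decoding succeeds.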

III. Reducing recognition time

Image binarization sets the gray value of every pixel to either 0 or 255, so that the whole image appears in stark black and white. Below, the original image is on the left and the binarized image is on the right.

The scanning algorithm performs binarization before decoding. Binarization greatly reduces the amount of data in the image and weakens the interference from other information when the image is blurred, has weak color contrast, is over- or under-lit, or is defaced, which helps detection and recognition.

The traditional approach performs binarization on the CPU, which consumes a lot of CPU resources. GPUs, however, are better at large-scale parallel computation, so we chose to binarize on the GPU, using RenderScript on Android and Metal on iOS, both very low-level frameworks.
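For reference, a minimal CPU version of binarization looks like the sketch below. The article does not say which thresholding method the scanner uses, so the mean-based global threshold here is an assumption.

```python
def binarize(gray, threshold=None):
    """Map each gray value (0-255) to 0 or 255; the default threshold is the image mean."""
    if threshold is None:
        pixels = [p for row in gray for p in row]
        threshold = sum(pixels) / len(pixels)
    return [[255 if p > threshold else 0 for p in row] for row in gray]


gray = [[10, 200],
        [120, 90]]
print(binarize(gray))
```

Once the threshold is known, every output pixel depends only on its own input pixel, which is the kind of independent per-pixel work that maps well onto a GPU kernel.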

The optimization results

1. iOS: with battery level, angle, lighting, and other environmental variables held constant, we tested five camera-frame binarization algorithms of the scanning core on an iPhone 6. The performance is as follows:

As you can see, Metal has a clear advantage in image binarization: it is nearly 150% faster than the original pure-CPU processing while reducing CPU usage by nearly 50 percentage points.

2. Android: there are many Android models, so we extracted online data. It shows that GPU binarization reduces the per-frame time by more than 30%.

IV. Scheduling and stability

Offline materials come in all shapes and conditions. To handle difficult cases, such as QR codes that are occluded, defaced, blurred, or captured at a very bad angle, the scanning algorithm needs some more time-consuming but more powerful algorithms, which the general case does not need. We therefore assign priorities to the recognition functions and schedule them through time delays, frame-skip triggers, and other methods:

Priority: there are tentatively three priority levels: high, medium, and low.

  • High priority: executed on every frame
  • Medium priority: executed at a reduced frame rate
  • Low priority: executed at a low frame rate
  • The execution time of functions at each priority level is configurable
  • The priority of each function is configurable

Special-scenario algorithms:

Specific capabilities of the scanning core, such as:

  • Inverted-color code recognition
  • Fault-tolerant boundary code recognition
  • Soiled finder pattern recognition
  • Barcode recognition, etc.
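The priority scheduling described in this section can be sketched as a small table of recognizers with a frame interval and a start delay. All names and numbers below are hypothetical configuration values chosen for illustration, not Alipay's settings.

```python
from dataclasses import dataclass


@dataclass
class Recognizer:
    name: str
    every_n_frames: int      # 1 = run on every frame (high priority)
    start_after_ms: int = 0  # delay before this recognizer is first tried


def due(recognizers, frame_index, elapsed_ms):
    """Names of the recognizers that should run on this frame."""
    return [r.name for r in recognizers
            if elapsed_ms >= r.start_after_ms
            and frame_index % r.every_n_frames == 0]


recognizers = [
    Recognizer("standard_qr", every_n_frames=1),                           # high
    Recognizer("inverted_color", every_n_frames=3, start_after_ms=500),    # medium
    Recognizer("soiled_finder", every_n_frames=10, start_after_ms=1500),   # low
]

print(due(recognizers, frame_index=0, elapsed_ms=0))
print(due(recognizers, frame_index=3, elapsed_ms=600))
print(due(recognizers, frame_index=30, elapsed_ms=2000))
```

Because the intervals and delays are plain data, they can be tuned remotely without changing the scanning code, which matches the configurable priorities and execution times listed above.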