How to reconcile more than 10 million orders per day (I)

Contents: order reconciliation · merchant-dimension reconciliation · dependencies & features · JVM optimization (choosing young/old generation sizes) · code optimization · other ideas · summary · 1024

Overview

A few days ago I published a warm-up post about reconciliation; now it's time for the real content. Articles should be about quality rather than quantity, since too many posts just waste everyone's time. That is also one of the reasons I gave up my original official account and registered this service account to share experience. A few posts a month is enough.

I seldom log in to the account's admin backend, so if you need to contact me, send me an email via my blog.

This series is divided into two parts. This first part mainly covers the analysis and architecture of the ten-million-order-level reconciliation system, the pitfalls encountered in the actual project, and their solutions.

This series does not cover the basics of the middleware, databases, frameworks, and plugins involved. If you want to learn the basics or how the project is set up, you can follow my GitHub/blog (https://github.com/chenhaoxiang), where I push project code and articles from time to time. This article does not stick to strict professional reconciliation terminology, so read it casually.

In the early stage, the system was developed as a fallback system, so not much attention was paid to it. It has been refactored twice, and the bottleneck for how much data it can reconcile now lies entirely in Redis. At present, reconciling nearly ten million orders uses about 2 GB of server memory at peak.

At present, development of the phase-II reconciliation system (phase I and phase II are separate systems, not a refactor) is also in progress, targeting reconciliation at the hundred-million-order level. It will be covered later in "How to reconcile more than 10 million orders per day (II)".

The 5 basic steps of reconciliation. A normal reconciliation basically cannot do without the following 5 steps:

  • 1. Data loading (caching the data is indispensable; needing to re-run a reconciliation is very common)
  • 2. Data comparison (batch comparison is crucial, so that server memory no longer depends on order volume)
  • 3. Difference entry (check the volume of difference data first; otherwise imagine inserting millions of difference records in one go)
  • 4. Difference processing (automatic processing is not recommended; you can offer one-click processing, but a person must confirm it)
  • 5. Clearing the cache (always clear the in-memory cache)

If you draw a picture, it looks something like this:

Querying orders: with tens of millions of order records per day, ordinary paging queries get slower and slower. It is recommended to first query the minimum and maximum IDs for the time range, and then query the order data in batches by ID.
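A minimal sketch of this ID-range batching, using Spring's JdbcTemplate. The table and column names (t_order, id, create_time, order_no, amount, status) are assumptions for illustration, not the project's actual schema.

import java.sql.Timestamp;
import java.util.List;
import java.util.Map;
import org.springframework.jdbc.core.JdbcTemplate;

public class OrderBatchLoader {

    public static void loadDay(JdbcTemplate jdbc, Timestamp start, Timestamp end, int batchSize) {
        // One pass to find the day's ID boundaries, so later queries walk the primary key
        // instead of using LIMIT/OFFSET paging, which gets slower as the offset grows.
        Long minId = jdbc.queryForObject(
                "SELECT MIN(id) FROM t_order WHERE create_time >= ? AND create_time < ?",
                Long.class, start, end);
        Long maxId = jdbc.queryForObject(
                "SELECT MAX(id) FROM t_order WHERE create_time >= ? AND create_time < ?",
                Long.class, start, end);
        if (minId == null || maxId == null) {
            return; // no orders in the time range
        }
        for (long from = minId; from <= maxId; from += batchSize) {
            long to = Math.min(from + batchSize - 1, maxId);
            List<Map<String, Object>> batch = jdbc.queryForList(
                    "SELECT id, order_no, amount, status FROM t_order WHERE id BETWEEN ? AND ?",
                    from, to);
            // hand the batch to the next step, e.g. serialize and cache it in Redis
        }
    }
}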

In the phase-I system, I used Redis for the order data cache and for order comparison, and split orders across keys by taking the order number modulo some value. The advantage is that horizontal scaling is very convenient; you don't have to worry about business growth. The disadvantage is the dependence on the Redis server: because Redis is single-threaded, even if we add servers and transfer data to Redis in batches for comparison, reconciliation will get slower and slower as the business grows (a Redis cluster plus batched transfer and comparison can solve this).
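The sketch below is not the phase-I system's exact comparison logic, just a minimal Jedis example of comparing two sides' order sets with SDIFF once both have been loaded into Redis; the key names and sample members are made up.

import java.util.Set;
import redis.clients.jedis.Jedis;

public class RedisCompareExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            // each member is a compact "orderNo|amount"-style string for one order
            jedis.sadd("recon:20181024:platform", "order1|100", "order2|200", "order3|300");
            jedis.sadd("recon:20181024:channel", "order1|100", "order2|250");

            // Members present on one side but missing or different on the other.
            Set<String> platformOnly = jedis.sdiff("recon:20181024:platform", "recon:20181024:channel");
            Set<String> channelOnly = jedis.sdiff("recon:20181024:channel", "recon:20181024:platform");
            System.out.println("platform-side differences: " + platformOnly);
            System.out.println("channel-side differences: " + channelOnly);
        }
    }
}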

Attention! The Redis server and the application server must transfer data over an intranet! Otherwise the speed will drive you to despair.

The system spends the most time downloading files and loading file data.

There is not much to be done about downloading; the bandwidth of the channel side's FTP server is what it is.

Loading the files is what we can actually work on. The phase-I system loaded them on a single thread, and loaded them as objects; the time spent on loading and serialization cannot be ignored, and this is where most of the time went. Nearly 10 million records took about 10 minutes, which is unacceptable.

Protostuff serialization is highly recommended (it is also faster than JSON serialization; Kryo is not recommended). Do not use Java native serialization: Protostuff beats it by a wide margin in both performance and memory usage. (In fact, loading reconciliation data can be optimized so that OpenCSV does not produce objects at all, as shown in the next installment.) For Redis, you can either store strings or store the byte arrays produced by Protostuff serialization.
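A minimal Protostuff round-trip sketch. The Order class here is a stand-in with a couple of fields, not the project's actual order model.

import io.protostuff.LinkedBuffer;
import io.protostuff.ProtostuffIOUtil;
import io.protostuff.Schema;
import io.protostuff.runtime.RuntimeSchema;

public class ProtostuffExample {

    public static class Order {
        public String orderNo;
        public long amountInCents;
    }

    private static final Schema<Order> SCHEMA = RuntimeSchema.getSchema(Order.class);

    public static byte[] serialize(Order order) {
        LinkedBuffer buffer = LinkedBuffer.allocate(LinkedBuffer.DEFAULT_BUFFER_SIZE);
        try {
            // produces a compact byte array suitable for storing in Redis
            return ProtostuffIOUtil.toByteArray(order, SCHEMA, buffer);
        } finally {
            buffer.clear();
        }
    }

    public static Order deserialize(byte[] bytes) {
        Order order = SCHEMA.newMessage();
        ProtostuffIOUtil.mergeFrom(bytes, order, SCHEMA);
        return order;
    }
}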

Some design patterns should be applied in the system. For example, reconciliation can use the strategy pattern with an abstract factory: each reconciliation type corresponds to a concrete strategy implementation. Keep the implementation as fine-grained as possible, so it is easy to decouple and reuse.
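A minimal sketch of the strategy idea: one implementation per reconciliation type, registered under a channel key. The names and types are illustrative, not the project's code.

import java.util.HashMap;
import java.util.Map;

public class ReconciliationStrategies {

    public interface ReconciliationStrategy {
        void reconcile(String statementDate);
    }

    private final Map<String, ReconciliationStrategy> strategies = new HashMap<>();

    public void register(String channel, ReconciliationStrategy strategy) {
        strategies.put(channel, strategy);
    }

    public void run(String channel, String statementDate) {
        ReconciliationStrategy strategy = strategies.get(channel);
        if (strategy == null) {
            throw new IllegalArgumentException("no reconciliation strategy for channel " + channel);
        }
        strategy.reconcile(statementDate);
    }

    public static void main(String[] args) {
        ReconciliationStrategies registry = new ReconciliationStrategies();
        registry.register("unionpay", date -> System.out.println("reconcile UnionPay orders for " + date));
        registry.register("wechat", date -> System.out.println("reconcile WeChat orders for " + date));
        registry.run("unionpay", "2018-10-24");
    }
}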

Merchant-dimension reconciliation verifies merchant income and disbursements, so it is very important. The order reconciliation described above can be understood as serving merchant reconciliation. The basic steps are similar to order reconciliation, so this dimension is not explained in detail here.

Dependencies & features

Many third-party tools were also used in the project, as shown below

Some design patterns and system features are also used

Matters needing attention

1. The phase-I system relies on OpenCSV to parse CSV files into objects. Because OpenCSV here was used with multiple threads and Netty to read file data into a List, off-heap memory overflowed once (OOM). The fix is either to enlarge the off-heap memory, or to stop Netty from using off-heap memory and let it use the heap instead. To enlarge off-heap memory:

-XX:MaxDirectMemorySize=1024m 

To disable Netty's use of off-heap memory:

-Dio.netty.noPreferDirect=true \
-Dio.netty.leakDetectionLevel=advanced \

Which option to use depends on your situation: if your server has enough memory, simply enlarge the off-heap memory, since disabling Netty's off-heap memory will slow down file parsing to some extent.

You could also parse the CSV files yourself, which is fairly easy and which I have tried, but there is a lot of special data to handle. For example, CSV columns are comma-separated, yet some order names contain commas and need special treatment; the same goes for casting strings to numbers. If you handle all of this yourself, you need your own special-case checks, and in both speed and reliability it is actually no better than OpenCSV. So we finally decided to use OpenCSV to parse the CSV files.

2. One point in OpenCSV can be improved for reconciliation. Because reconciliation data is inserted very frequently, an array-backed collection is not recommended; a linked-list collection is strongly recommended instead. CsvToBean.parse() in OpenCSV uses an ArrayList. You can rewrite that class and CsvToBeanBuilder with the decorator pattern to use a LinkedList, or use reflection/dynamic proxying for this method. In practice, after switching to a linked-list collection, reconciliation became about 1 minute faster.
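A minimal sketch of the same idea without patching OpenCSV internals, assuming an OpenCSV 4.x version where CsvToBean is Iterable: the beans are iterated straight into a LinkedList, so parse()'s intermediate ArrayList is never built. The OrderRecord bean is a hypothetical annotated class standing in for the real statement record.

import java.io.FileReader;
import java.io.Reader;
import java.util.LinkedList;
import java.util.List;
import com.opencsv.bean.CsvToBean;
import com.opencsv.bean.CsvToBeanBuilder;

public class CsvLinkedListLoader {

    public static List<OrderRecord> load(String path) throws Exception {
        try (Reader reader = new FileReader(path)) {
            CsvToBean<OrderRecord> csvToBean = new CsvToBeanBuilder<OrderRecord>(reader)
                    .withType(OrderRecord.class)
                    .build();
            // Iterate instead of calling parse(), and choose the target collection ourselves.
            List<OrderRecord> records = new LinkedList<>();
            for (OrderRecord record : csvToBean) {
                records.add(record);
            }
            return records;
        }
    }
}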

3. How do you quickly locate the problem when reconciliation goes wrong? Problems are inevitable. In the early days of running the system we ran into all kinds of issues: UnionPay/platform data errors, incomplete payment-channel data, data not generated in the agreed format, extra special symbols causing parsing errors, Redis data-transfer timeouts, and so on.

  • I. Data errors on the platform side and incomplete data on the channel side are completely out of our control; the initiative is in someone else's hands. What you can do is locate the problem quickly. The first time there was a problem with channel data, locating it took more than an hour, delaying merchants' funds by several hours and costing the company a large amount in capital cost. That is totally unacceptable and was a consequence of my lack of foresight. Afterwards, for UnionPay data problems and channel data problems, we added MD5 checksums, file-size checks, and order-count checks; any mismatch is instantly sent by a bot to the DingTalk group (a minimal checksum sketch follows this list). Since then, the several UnionPay and channel order-data problems that did occur were all raised by phone with the other company's developers within minutes, and the statements were regenerated and re-verified. Make sure problems are resolved immediately.
  • II. Special symbols are rare; the only ones that have needed handling are spaces and tabs.
  • III. If Redis takes too long to transmit data, the connection may also be closed; remember to set a longer timeout.
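A minimal sketch of the file checks mentioned in point I: compare the downloaded statement's size and MD5 against expected values. How the expected values are obtained (for example from a companion .md5 file published by the channel) is an assumption, not the actual protocol.

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;

public class StatementFileChecker {

    public static boolean verify(Path file, String expectedMd5, long expectedSize) throws Exception {
        if (Files.size(file) != expectedSize) {
            return false; // size mismatch: alert the DingTalk group immediately
        }
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md5)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // reading through the stream feeds the digest
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString().equalsIgnoreCase(expectedMd5);
    }
}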

JVM optimization

Early in the phase-I system's operation, OOM events also occurred several times, so here is how we tuned the JVM to prevent OOM.

The Java heap can be roughly divided into the young generation and the old generation. Young generation: stores objects with short life cycles. Old generation: stores objects with long life cycles (simply put, objects that survive several GC cycles).

In the reconciliation system, the daily run takes only a few tens of minutes, but in those minutes tens of millions of objects are created in for loops, such as non-reusable strings like order numbers. Therefore, the memory allocated to the young generation should be equal to or larger than that allocated to the old generation.

On choosing young and old generation sizes

  • Young generation size: for response-time-priority applications, make it as large as possible, up to the system's minimum response-time limit (choose based on the actual situation). This minimizes the frequency of young-generation GCs and also reduces the number of objects promoted to the old generation. For throughput-priority applications, also set it as large as possible, since there is no response-time requirement and garbage collection can run in parallel.
  • Old generation size: response-time-priority applications generally use a concurrent collector for the old generation, so the size needs to be set carefully, taking into account parameters such as the concurrent collection rate and session duration. If the heap is too small, it can cause memory fragmentation, a high collection frequency, and application pauses that fall back to traditional mark-sweep collection. If the heap is large, collections take longer.

I chose to set the ratio of young generation to old generation to 2:1 (choose for yourself; 1:1 is generally recommended). In addition, if object usage in the system is poor, you can explicitly call GC periodically to speed up the collection of garbage objects.

The effect of the optimization is obvious: reconciling 6 million records became 62 seconds faster (this time excludes FTP statement download, decryption, and decompression). Original setting: -Xms6g -Xmx6g -Xmn2g

Optimized setting: -Xms6g -Xmx6g -Xmn4g

Note, however, that young-generation GC (minor GC) generally uses a copying algorithm, whose key characteristic is that it only cares about what needs to be copied: reachability analysis marks and copies the few live objects, and there is no need to traverse the whole heap because most of it is garbage. The obvious drawback is that it wastes half the memory space (mitigated in practice by splitting into Eden and survivor spaces, not discussed here); the advantage is that GC is fast (typically around 10 times faster than an old-generation major GC).

Code optimization

1. Do not have Log4j output file names and line numbers, because Log4j does this by taking a thread stack snapshot, which generates a large number of strings. Also, when using Log4j, check whether the corresponding log level is enabled before building the message; otherwise a large number of strings is generated anyway. The SLF4J interface delays string concatenation until print time, avoiding unnecessary overhead from premature concatenation. Note, however, that you must use the placeholder format rather than + concatenation.
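A minimal sketch of both points (the logger name and message text are illustrative):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingExample {
    private static final Logger log = LoggerFactory.getLogger(LoggingExample.class);

    void compare(String orderNo, long amount) {
        // SLF4J placeholder form: the message is only assembled if DEBUG is enabled.
        log.debug("comparing order {} with amount {}", orderNo, amount);

        // If building the message itself is expensive, guard it explicitly.
        if (log.isDebugEnabled()) {
            log.debug("order detail: {}", buildExpensiveDetail(orderNo));
        }
    }

    private String buildExpensiveDetail(String orderNo) {
        return "detail-of-" + orderNo; // placeholder for an expensive operation
    }
}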

2. String concatenation in a for loop over more than 1 million records: on JDK 8 and above, + concatenation is recommended; never use format for concatenation. With 5 million concatenations:

  • + concatenation is roughly 30 times as fast as format

  • + concatenation is roughly twice as fast as builder

The data bears this out.

The test code is as follows:

public static void main(String[] args) {
    long st;
    long et;
    int size = 5000000;

    st = System.nanoTime();
    for (int i = 0; i < size; i++) {
        format("format" + i, 21);
    }
    et = System.nanoTime();
    System.out.println("format " + (et - st) / 1000000 + "ms");

    st = System.nanoTime();
    for (int i = 0; i < size; i++) {
        plus("plus" + i, 21);
    }
    et = System.nanoTime();
    System.out.println("plus " + (et - st) / 1000000 + "ms");

    st = System.nanoTime();
    for (int i = 0; i < size; i++) {
        concat("concat" + i, 21);
    }
    et = System.nanoTime();
    System.out.println("concat " + (et - st) / 1000000 + "ms");

    st = System.nanoTime();
    for (int i = 0; i < size; i++) {
        builder("builder" + i, 21);
    }
    et = System.nanoTime();
    System.out.println("builder " + (et - st) / 1000000 + "ms");

    st = System.nanoTime();
    for (int i = 0; i < size; i++) {
        buffer("buffer" + i, 21);
    }
    et = System.nanoTime();
    System.out.println("buffer " + (et - st) / 1000000 + "ms");
}

static String format(String name, int age) {
    return String.format("My name is %s, I am %d this year", name, age);
}

static String plus(String name, int age) {
    return "My name is " + name + ", I am " + age + " this year";
}

static String concat(String name, int age) {
    return "My name is ".concat(name).concat(", I am ").concat(String.valueOf(age)).concat(" this year");
}

static String builder(String name, int age) {
    StringBuilder sb = new StringBuilder();
    sb.append("My name is ").append(name).append(", I am ").append(age).append(" this year");
    return sb.toString();
}

static String buffer(String name, int age) {
    StringBuffer sb = new StringBuffer();
    sb.append("My name is ").append(name).append(", I am ").append(age).append(" this year");
    return sb.toString();
}

You can verify this yourself. Since JDK 5, the compiler uses StringBuilder under the hood for + string concatenation, so for convenience you generally don't need to agonize over + versus StringBuilder. Within 100,000 loops, StringBuilder concatenation is about 1.1 times as fast as +, which is negligible: 100,000 concatenations took 59 ms with + and 50 ms with StringBuilder. Beyond 100,000 loops, + concatenation actually took less time than StringBuilder in my tests, and the more loops, the bigger the difference. (The actual times depend on your machine's configuration.) When there is no concurrency, use + boldly.

In the actual reconciliation, the time taken when concatenating strings with format:

After optimizing the format concatenation to + concatenation:

3. Do not use finalizer methods; they affect GC execution.

4. Use try-with-resources to close streams automatically (supported on JDK 7 and above).
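A minimal sketch (the file path is illustrative); the reader is closed automatically even if an exception is thrown:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class TryWithResourcesExample {
    public static long countLines(String path) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            return reader.lines().count();
        }
    }
}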

5. Try not to load classes dynamically inside a for loop; if you must, always cache them.

6. Avoid replace/replaceAll in for loops (use Apache commons-lang StringUtils instead).

7. When reading tens of millions of order rows from the read replica during the morning peak, it is recommended to sleep about 10 ms after every 10,000 rows read (better still, cache the data in the middle of the night).

8. Do not read/save the whole data set to Redis in one go. Of course you can if you must (just remember to set a long enough timeout, otherwise Redis responses will time out).

The simplest approach is to take the order number modulo some value (though it is more advisable to use charAt/substring to take one or a few characters from the order number to build the key, because the order number may not be numeric; ours isn't...), and store the orders in batches in sets under different Redis keys. That way, even when data reaches tens of millions or hundreds of millions, you only need extra servers for distributed reconciliation, and the time can be kept within that of a hundred-thousand-level reconciliation (it cannot be ruled out that the chosen digit is the same across all ten million order numbers; redistribution for that case needs to be considered). The charAt method:

public char charAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index];
}

Thanks to String's internal implementation, it can return the character at a given position of value very quickly, so calling charAt costs almost nothing. Calling charAt across ten million records adds only about 100 ms.
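Building on point 8 and the charAt note above, here is a minimal sketch of the shard-key idea using Jedis. The key prefix, the character taken from the order number, and the shard count are all assumptions for illustration.

import redis.clients.jedis.Jedis;

public class OrderShardWriter {

    private static final int SHARDS = 16;

    // Build a shard key from one character of the order number, so orders spread
    // across several Redis sets and can be compared shard by shard.
    static String shardKey(String statementDate, String orderNo) {
        char c = orderNo.charAt(orderNo.length() - 1); // last character; may not be a digit
        int shard = c % SHARDS;
        return "recon:" + statementDate + ":" + shard;
    }

    public static void save(Jedis jedis, String statementDate, String orderNo, String orderData) {
        // store compact "orderNo|amount|fee|status"-style strings in the shard's set
        jedis.sadd(shardKey(statementDate, orderNo), orderNo + "|" + orderData);
    }
}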

9. Split the reconciliation file, and use multiple threads to read only the strings actually needed for reconciliation (order number, amount, fee, status, and other required fields) into Redis; this speeds things up a lot.
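A minimal sketch of point 9: the statement has already been split into chunk files, and a thread pool reads only the needed fields from each chunk. The column positions and file layout are assumptions about the statement format, not the real channel format.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ChunkedStatementLoader {

    public static void loadChunks(List<Path> chunkFiles) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        for (Path chunk : chunkFiles) {
            pool.submit(() -> loadOneChunk(chunk));
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.MINUTES);
    }

    private static void loadOneChunk(Path chunk) {
        try {
            for (String line : Files.readAllLines(chunk)) {
                String[] cols = line.split(",");
                // keep only the fields needed for comparison, e.g. "orderNo|amount|fee|status"
                String compact = cols[0] + "|" + cols[3] + "|" + cols[4] + "|" + cols[6];
                // push `compact` into the appropriate Redis shard here
            }
        } catch (IOException e) {
            throw new RuntimeException("failed to load chunk " + chunk, e);
        }
    }
}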

10. Do not use Java native serialization! FST and Protostuff are recommended here.

11. Special symbols in reconciliation files, such as tabs and spaces, must be handled properly. Even if your files don't contain them today, you must still add the checks or transformations to guard against the day they suddenly appear. Note that adding these checks and replacements delays a ten-million-level reconciliation by about 2 minutes, and the added time grows linearly with order volume.

12. The logic of each reconciliation may differ, but wherever common steps or methods can be extracted, extract them into classes or methods. Do not allow code redundancy; otherwise later maintenance and code readability suffer badly.

13. Do not load reconciliation data into an array-backed collection; choose a LinkedList or a Set. With multithreading, choose a thread-safe collection. (Note that the timings quoted in this article vary with server performance.)

Recently I have been learning about blockchain, and I have been thinking about applying some of its ideas to reconciliation, for example using a Merkle tree for order reconciliation, or using RocksDB to store order data for comparison, and so on.

Using a Merkle tree for order reconciliation is feasible, but in practice, after testing, hashing the order data (O(n*x)) takes about twice as long as Set comparison (O(n)). Unacceptable, so it was abandoned (n is the order count, x the length of each order's data string).

RocksDB, on the other hand, is used for order data storage and modulo-based comparison, and has been applied in the phase-II reconciliation system. For now my knowledge of blockchain is still superficial, and I will keep studying it (I may write a solid article about blockchain learning/projects later).

At present, the phase-I system has been running stably for several months, with no problems in the past month or so.

The bottleneck of the phase-I system's reconciliation is Redis, and changing that system would be very laborious. This is one of the reasons for developing the next phase of the reconciliation system.

First, the architecture of the phase-II system is certainly better than that of phase I: many reconciliation modules have been extracted to make reuse and refactoring easy. Second, with RocksDB local storage, there is no longer a worry that the order volume will outgrow Redis memory. Data is stored under keys by multiple threads, and differing data can be compared in batches, relieving the server's memory pressure.

In addition, file order data is no longer read into objects; order data is read and compared as strings. Serialization is no longer needed, which is faster and saves memory. Finally, the business side (the **** department of our company) added other requirements that the original phase-I system could hardly keep supporting (the consequence of continuing would be difficult development and ever harder maintenance), which was another reason for building the phase-II system.

Later there will be an article on the development of the phase-II reconciliation system (I will consider attaching template code for parts of the architecture).

That concludes this introduction to developing the phase-I reconciliation system. I hope it helps; I believe readers who take it seriously will get something out of it.

If you have a good plan or suggestion, feel free to leave a comment; adopted plans/suggestions will be credited with thanks in the next article.

Today is 1024, Programmers' Day, so let me underline the key point for everyone: no overtime today, no overtime! Our perks:

Does it look like everyone needs feeding? Then I'll help you eat it ~ I forgot to take photos of our HR lady for you, so here's a frame from a short video.

Some company benefits:

Looks nice, right ~ I hear our company will have this next year too, hahaha ~~

Finally, there are other corporate perks:

I wish you all an early finish every day; if you don't have a girlfriend, may you find one soon; if you don't have a boyfriend, you're welcome to come to me (yes, I'm being cheeky, and happily so).