1. Introduction
Distributed system development involves many scenarios where data consistency must be guaranteed, such as receiving MQ messages, receiving HTTP requests, and internal business processing. If you are not familiar with these scenarios or are not sure how to handle them, read on.
2. Receiving MQ messages
Receiving MQ messages is a common scenario in distributed system development. An external system pushes MQ messages to our system, which runs its business logic on each message it receives. On the surface this is a very simple process, but once data consistency is involved, it is not so simple.
Why not? Consider the following scenarios:
2.1 Ack the message, then process the business logic
```java
ack();
// process the business logic
```
In an ideal world, acking the message first and then processing the business logic causes no problems. But that holds only under ideal conditions. If a call to an external interface fails during business processing, or the database is down, the message has already been acked and will not be redelivered, so it is lost and the data becomes inconsistent.
2.2 Process the business logic, then ack the message
```java
// process the business logic
ack();
```
If acking first does not work, what about processing the business logic first and acking afterwards? With this order, a message whose processing fails is never acked, so it is redelivered and consumed again, and no message is lost. But this introduces another problem: idempotency. Because the same message can be consumed more than once, processing that is not idempotent will still leave the data inconsistent.
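To make the idempotency point concrete, here is a minimal sketch of a consumer that skips redelivered messages. Everything in it is an assumption for illustration: the class name `IdempotentConsumer`, the `handle` method, and the in-memory set, which stands in for a database table with a unique constraint on the message ID.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical consumer; the set stands in for a table of processed message IDs.
class IdempotentConsumer {
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final AtomicLong balance = new AtomicLong();

    /** Returns true if the business logic ran, false if the message was a duplicate. */
    boolean handle(String messageId, long amount) {
        // add() returns false when the ID is already present,
        // so a redelivered message is detected and skipped.
        if (!processedIds.add(messageId)) {
            return false;
        }
        // The "business logic". With a real database, this write and the
        // processed-ID record would belong in the same transaction.
        balance.addAndGet(amount);
        return true;
    }

    long balance() {
        return balance.get();
    }
}
```

Calling `handle("m-1", 100)` twice changes the balance only once, which is the property a redelivered message needs.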
Processing first and acking afterwards can cause yet another problem. If the business code has a bug, the message can never be acked, and its processing falls into an infinite redelivery loop.
So neither order works on its own. Does that mean there is no solution? Of course not. Let me break it down for you.
2.3 Combining the message ack mechanism, a database, and a scheduled task: scheme 1
```java
try {
    try {
        // Look up the message by its unique ID to see whether it has been processed.
        // If it has not been processed, run the business logic.
    } catch (Exception e) {
        // Insert the failed message into the database.
    }
    ack();
} catch (Exception e) {
    unack();
}
```
The general idea is to check, using the message's unique ID, whether the message has already been processed. If it has not, run the business logic, and ack the message if processing succeeds. If processing fails, save the message to the database and ack it as well. If saving to the database also fails, unack the message so that the message server redelivers it, and keep re-consuming it until the database recovers. Messages that were saved to the database after a failure can then be retried by a scheduled task.
This scheme solves both the idempotency problem from 2.2 and the problem of a business bug sending message processing into an infinite loop (the scheduled task can limit the number of retries). It does have one problem, which is outside the scope of this article: message backlog.
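Scheme 1 together with its scheduled retry could look roughly like the sketch below. It is only an illustration under stated assumptions: `SchemeOneConsumer`, `FailedMessage`, `retryFailed`, and the limit `MAX_RETRIES = 3` are hypothetical, the in-memory collections stand in for database tables, and `onMessage` returns `true` for ack and `false` for unack instead of calling a real MQ client.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical sketch of scheme 1; collections stand in for database tables.
class SchemeOneConsumer {
    static final int MAX_RETRIES = 3; // assumed retry limit

    /** A message whose business processing failed and awaits the scheduled task. */
    static final class FailedMessage {
        final String id;
        final String body;
        int retries;

        FailedMessage(String id, String body) {
            this.id = id;
            this.body = body;
        }
    }

    private final Set<String> processedIds = new HashSet<>();
    private final List<FailedMessage> failedTable = new ArrayList<>();
    private final Consumer<String> businessLogic;

    SchemeOneConsumer(Consumer<String> businessLogic) {
        this.businessLogic = businessLogic;
    }

    /** Returns true when the message should be acked, false when it should be unacked. */
    boolean onMessage(String id, String body) {
        try {
            try {
                if (!processedIds.contains(id)) {
                    businessLogic.accept(body);
                    processedIds.add(id);
                }
            } catch (Exception e) {
                // Business processing failed: keep the message for the scheduled task.
                // With a real database this insert can itself throw.
                failedTable.add(new FailedMessage(id, body));
            }
            return true; // ack
        } catch (Exception e) {
            return false; // unack: saving failed too, so let the broker redeliver
        }
    }

    /** Body of the scheduled task: retry each failed message at most MAX_RETRIES times. */
    void retryFailed() {
        for (FailedMessage m : failedTable) {
            if (m.retries >= MAX_RETRIES || processedIds.contains(m.id)) {
                continue; // exhausted messages stay in the table for manual handling
            }
            try {
                businessLogic.accept(m.body);
                processedIds.add(m.id);
            } catch (Exception e) {
                m.retries++;
            }
        }
        failedTable.removeIf(m -> processedIds.contains(m.id));
    }

    boolean isProcessed(String id) {
        return processedIds.contains(id);
    }

    int failedCount() {
        return failedTable.size();
    }

    long exhaustedCount() {
        return failedTable.stream().filter(m -> m.retries >= MAX_RETRIES).count();
    }
}
```

Keeping exhausted messages in the table rather than deleting them is a design choice of this sketch: a business bug stops the retries but the data is still there to inspect and replay once the bug is fixed.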
2.4 Combining the message ack mechanism, a database, and a scheduled task: scheme 2
```java
try {
    // Look up the message by its unique ID.
    // If it does not exist, insert it into the database.
    // If it already exists, do nothing.
    ack();
} catch (Exception e) {
    unack();
}
```
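In scheme 2 the consumer only stores the message and acks, and the business logic runs later. A minimal sketch of that split, assuming hypothetical names (`SchemeTwoConsumer`, `StoredMessage`, `processPending`) and an in-memory map in place of the message table:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of scheme 2; the map stands in for a message table
// keyed by the unique message ID.
class SchemeTwoConsumer {
    static final class StoredMessage {
        final String body;
        boolean done;

        StoredMessage(String body) {
            this.body = body;
        }
    }

    private final Map<String, StoredMessage> messageTable = new LinkedHashMap<>();

    /** Returns true when the message should be acked, false when it should be unacked. */
    boolean onMessage(String id, String body) {
        try {
            // Insert only if the ID is not stored yet; a duplicate is ignored.
            messageTable.putIfAbsent(id, new StoredMessage(body));
            return true; // ack
        } catch (Exception e) {
            return false; // unack: the insert failed, so let the broker redeliver
        }
    }

    /** Body of the asynchronous worker or scheduled task. Returns how many messages it completed. */
    int processPending(Consumer<String> businessLogic) {
        int processed = 0;
        for (StoredMessage m : messageTable.values()) {
            if (m.done) {
                continue;
            }
            try {
                businessLogic.accept(m.body);
                m.done = true;
                processed++;
            } catch (Exception e) {
                // Leave the message pending so the next run retries it.
            }
        }
        return processed;
    }

    int storedCount() {
        return messageTable.size();
    }
}
```

Because `onMessage` does no business work, the consumer returns quickly, at the cost of storing every message.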
Comparison of the advantages and disadvantages of 2.3 and 2.4
| Scheme | Advantages | Disadvantages |
|---|---|---|
| 2.3 | Messages are inserted into the database only when business processing fails, so the number of stored messages is not very large | Message processing is slow, which can cause a message backlog |
| 2.4 | Messages are processed asynchronously, without backlog | All messages are stored in the database, so the number of stored messages can be large |
Choose between the two schemes according to your actual situation. For the message backlog problem, see the article "How do you deal with message backlog?"
3. Receiving HTTP requests
Looking at this diagram, you might think this is a simple process: our system receives the request, processes it, and returns the result. It is simple, but consider what happens when our system fails. The peripheral system retries, and if our system has still not recovered by the time the retries are used up, the peripheral system's request data is lost and the data becomes inconsistent.
Looking more closely at the scenario, the root of the inconsistency is that the response depends on our system's business processing. The problem goes away if our system responds as soon as it has correctly received the peripheral system's request.
So after receiving the HTTP request, we only need to write the request content to the database. If the write succeeds, return success and process the request asynchronously. If the write fails, return failure and rely on the peripheral system's retries to get the data into the database eventually. Requests whose processing fails can be retried by a scheduled task.
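The "store first, respond, process later" idea can be sketched without a real web framework. In the sketch below, `RequestInbox`, `receive`, and `setDatabaseUp` are hypothetical names, the queue stands in for a request table, and the returned integers mimic HTTP status codes (200 for success, 500 for failure).

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch; the queue stands in for a database table of received requests.
class RequestInbox {
    private final Queue<String> requestTable = new ArrayDeque<>();
    private boolean databaseUp = true; // switch used only to simulate an outage

    void setDatabaseUp(boolean up) {
        this.databaseUp = up;
    }

    /** Handles one request: 200 if the payload was stored, 500 so the caller retries otherwise. */
    int receive(String payload) {
        try {
            save(payload);
            return 200; // stored; business processing happens asynchronously
        } catch (IllegalStateException e) {
            return 500; // not stored; the peripheral system is expected to retry
        }
    }

    private void save(String payload) {
        if (!databaseUp) {
            throw new IllegalStateException("database unavailable");
        }
        requestTable.add(payload);
    }

    /** Called by the asynchronous worker; returns null when nothing is pending. */
    String pollForProcessing() {
        return requestTable.poll();
    }

    int pending() {
        return requestTable.size();
    }
}
```

The response depends only on the database write, so a failure in the business logic no longer causes the peripheral system to burn through its retries.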
4. Internal business processing
After our system finishes some business operation successfully, it often needs to notify peripheral systems. A peripheral system failure must not prevent the current business operation from completing, so the notification is usually sent asynchronously. But asynchronous notification has its own problem: the current business operation can succeed while the notification to the peripheral system fails, which leaves the data inconsistent.
Is there a way to ensure that both the business processing and the notification to the peripheral system succeed?
We can use a local transaction: within the same database transaction as the current business operation, insert the data to be sent to the peripheral system. Then notify the peripheral system asynchronously, and let a scheduled task retry any notifications that fail.
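A rough sketch of the local transaction approach follows. The names `OrderService`, `placeOrder`, and `dispatch` are hypothetical, two in-memory lists stand in for the business table and the notification table, and a `synchronized` method stands in for the database transaction that would make both inserts atomic in a real system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; the lists stand in for two tables in the same database.
class OrderService {
    private final List<String> orders = new ArrayList<>();         // business table
    private final List<String> pendingNotices = new ArrayList<>(); // notification table

    /**
     * Records the order and the notification together. In a real system both
     * inserts would run inside one local database transaction.
     */
    synchronized void placeOrder(String orderId) {
        if (orderId == null || orderId.isEmpty()) {
            throw new IllegalArgumentException("orderId is required");
        }
        orders.add(orderId);
        pendingNotices.add("order-created:" + orderId);
    }

    /**
     * Body of the asynchronous sender or scheduled task. The sender returns true
     * when the peripheral system accepted the notification. Returns how many were sent.
     */
    synchronized int dispatch(Predicate<String> sender) {
        int sent = 0;
        for (String notice : new ArrayList<>(pendingNotices)) {
            boolean delivered;
            try {
                delivered = sender.test(notice);
            } catch (RuntimeException e) {
                delivered = false; // treat an exception as a failed delivery
            }
            if (delivered) {
                pendingNotices.remove(notice);
                sent++;
            }
            // Undelivered notices stay in the table and are retried on the next run.
        }
        return sent;
    }

    int orderCount() {
        return orders.size();
    }

    int pendingCount() {
        return pendingNotices.size();
    }
}
```

A notification that fails simply stays in the table, so a peripheral outage delays the notification but never blocks or undoes the order.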
5. Summary
As this article shows, the solutions to these consistency problems all come down to database + retry, so that is a good direction to consider when you face a consistency problem.