Distributed transactions are mainly implemented with the following six schemes:
- XA scheme
- TCC scheme
- SAGA scheme
- Local message table
- Reliable message eventual consistency scheme
- Best-effort notification scheme
Two-phase commit scheme / XA scheme
The XA scheme, also known as two-phase commit, introduces a transaction manager that coordinates a transaction across multiple databases (the resource managers). The transaction manager first asks each database: are you ready? If every database replies OK, the transaction is formally committed and the operation is performed on each database; if any database answers no, the transaction is rolled back.
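The ask-then-commit flow described above can be sketched in a few lines. This is only an in-memory illustration, not a real XA implementation: `ResourceManager`, `can_commit` and `two_phase_commit` are hypothetical names invented for the example, and a real database would durably record its prepared state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResourceManager:
    """Hypothetical stand-in for one database taking part in the transaction."""
    name: str
    can_commit: bool = True   # what this participant answers in phase 1
    state: str = "idle"       # idle -> prepared -> committed / rolled_back

    def prepare(self) -> bool:
        # Phase 1: "are you ready?" A real database would persist the
        # pending work here so it can still commit after a crash.
        if self.can_commit:
            self.state = "prepared"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled_back"


def two_phase_commit(participants: List[ResourceManager]) -> bool:
    """Transaction manager: commit only if every participant votes yes."""
    votes = [p.prepare() for p in participants]   # phase 1: collect votes
    if all(votes):
        for p in participants:                    # phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:                        # phase 2: roll back everywhere
        p.rollback()
    return False
```

Calling `two_phase_commit` with one participant whose `can_commit` is `False` leaves every participant in the `rolled_back` state, which is the "either database answers no" branch of the description.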
This scheme is better suited to a monolithic application whose transactions span multiple databases. Because it relies heavily on the databases themselves to handle the complex transaction, it is very inefficient and definitely not suitable for high-concurrency scenarios. If you want to try it, it can be done with Spring + JTA; search for a demo and you will see how it works.
This scheme is rarely used. Generally speaking, if one system operates across multiple databases like this, it is breaking the rules. Nowadays, with microservices, a large system is split into dozens or even hundreds of services, and our rules and specifications generally require each service to operate only on its own database.
If you need to work with the database behind another service, you are not allowed to connect to that database directly, because that violates the microservice architecture specification. With hundreds of services accessing each other's databases at random, everything falls apart: such a set of services cannot be governed or controlled, data may be altered by someone else, your own database may be written to by someone else, and so on.
If you need to operate on another service's database, you must do so by calling that service's interface; direct cross-access to someone else's database is never allowed.
TCC scheme
TCC stands for Try, Confirm, and Cancel.
- Try phase: This phase checks the resources of each service and locks or reserves them.
- Confirm phase: This phase performs the actual operations in each service.
- Cancel phase: If the business method of any service fails, compensation is needed, that is, the business logic that has already executed successfully must be rolled back. (Roll back whatever succeeded.)
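The three phases above can be illustrated with a funds-freezing example. This is a minimal sketch under assumptions: `Account`, `InsufficientFunds` and `run_tcc` are hypothetical names, everything runs in one process, and here Cancel is triggered by a failed Try; a real TCC framework also has to handle retries, timeouts and idempotency.

```python
class InsufficientFunds(Exception):
    """Raised by Try when the resource check fails."""


class Account:
    """Hypothetical participant exposing the three TCC operations."""

    def __init__(self, balance):
        self.balance = balance   # funds still available
        self.frozen = 0          # funds reserved by Try

    def try_debit(self, amount):
        # Try: check the resource and reserve it, without spending it yet.
        if self.balance < amount:
            raise InsufficientFunds()
        self.balance -= amount
        self.frozen += amount

    def confirm_debit(self, amount):
        # Confirm: the reserved funds actually leave the account.
        self.frozen -= amount

    def cancel_debit(self, amount):
        # Cancel: release the reservation and restore the balance.
        self.frozen -= amount
        self.balance += amount


def run_tcc(debits):
    """debits is a list of (account, amount). Returns True if confirmed."""
    tried = []
    try:
        for account, amount in debits:
            account.try_debit(amount)
            tried.append((account, amount))
    except InsufficientFunds:
        # Only the participants whose Try succeeded need compensation.
        for account, amount in reversed(tried):
            account.cancel_debit(amount)
        return False
    for account, amount in tried:
        account.confirm_debit(amount)
    return True
```

Note how much of the code is reservation and compensation logic even in this toy case; that is the maintenance cost of TCC.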
To be honest, this scheme is not used by many people, and we rarely use it ourselves, but it does have its scenarios. Because the rollback relies heavily on code you write yourself to roll back and compensate, the amount of compensation code can be huge.
For example, in scenarios that involve money, such as payments and trades, we use TCC to strictly guarantee that a distributed transaction either succeeds completely or rolls back completely and automatically, so that the funds are always correct.
It also works best when each business operation takes only a short time to execute.
But to be honest, try to avoid it where you can: hand-written rollback or compensation logic makes the business code hard to maintain.
Saga scheme
A financial core business may choose the TCC scheme in pursuit of strong consistency and higher concurrency, while the business systems built on top of the financial core tend to choose compensating transactions. The Saga theory of compensating transactions was proposed more than 30 years ago and has gradually attracted attention in recent years with the rise of microservices. Saga is now widely accepted as a solution for long-running transactions.
Basic principle
Each participant in a business process commits its own local transaction. If one participant fails, the participants that have already succeeded are compensated. The normal transaction flow is shown on the left side of the figure below. When an error occurs while T3 is executing, the compensation flow on the right starts: the compensation services C3, C2 and C1 of T3, T2 and T1 are executed in reverse order to undo the data changes made by T3, T2 and T1.
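The T3 failure case can be sketched as a small orchestrator. This is an illustration under assumptions, not a Saga framework: `run_saga` and `make_step` are hypothetical names, and the sketch assumes each compensation is safe to run even when its step only partly applied, which is why C3 runs for the failed T3.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    On failure, compensate the failed step and every earlier step in
    reverse order, mirroring the T3 -> C3, C2, C1 example in the text.
    Assumption: a compensation is safe to run even if its step only
    partly applied (or did not apply at all).
    """
    started = []                      # compensations for steps begun so far
    for action, compensation in steps:
        started.append(compensation)
        try:
            action()                  # commits this participant's local transaction
        except Exception:
            for compensate in reversed(started):
                compensate()
            return False
    return True


def make_step(n, log, fail=False):
    """Hypothetical demo step Tn with compensation Cn; both append to log."""
    def action():
        if fail:
            raise RuntimeError("T%d failed" % n)
        log.append("T%d" % n)

    def compensate():
        log.append("C%d" % n)

    return action, compensate
```

Running three steps where the third fails records T1 and T2, then C3, C2 and C1 in that order. Between T1 committing and C1 running, other readers can see T1's changes, which is the isolation gap that Saga does not close.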
Usage scenarios
For scenarios with high consistency requirements, short processes, and high concurrency, such as financial core systems, the TCC scheme is preferred. Other scenarios do not need that level of consistency; eventual consistency is enough.
For example, many of the businesses built on top of the financial core (channel layer, product layer, system integration layer) only need eventual consistency, have many processes, have long processes, and may also call services provided by other companies. If TCC were chosen here, the development cost would be high on the one hand, and on the other hand there is no way to require other companies' services to follow the TCC pattern. The long processes also mean long transaction boundaries and long lock times, which hurts concurrency.
So the Saga mode applies to the following scenarios:
- There are many business processes and the processes are long;
- Participants include services of other companies or legacy systems that cannot provide the three interfaces required by the TCC pattern.
Advantages
- Local transactions are committed in a single phase with no locks, so performance is high;
- Participants can execute asynchronously with high throughput;
- Compensation services are easy to implement because the reverse of an update operation is relatively easy to understand.
Disadvantages
- Transaction isolation is not guaranteed.
Local message table
The local message table is an approach that originated at eBay.
It goes something like this:
- When system A performs an operation in its own local transaction, it also inserts a record into its message table;
- System A then sends this message to MQ;
- After receiving the message, system B, in a single transaction, inserts a record into its own local message table and performs its other business operations. If the message has already been processed, the transaction is rolled back at this point, which guarantees that the message is never processed twice;
- Once system B has executed successfully, it updates the status in its own local message table and in system A's message table;
- If system B fails to process the message, the message table status is not updated. System A periodically scans its own message table, and if it finds unprocessed messages it sends them to MQ again so that B can process them again;
- This scheme guarantees eventual consistency: even if B's transaction fails, A keeps resending the message until B succeeds.
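The steps above can be sketched with two in-memory SQLite databases and a `deque` standing in for MQ. Everything here is an assumption made for illustration: the table names (`orders`, `outbox`, `inbox`, `shipments`), the function names, and the `ack` callback that stands in for system B calling system A's interface.

```python
import sqlite3
from collections import deque

SCHEMA_A = """
    CREATE TABLE orders(id INTEGER PRIMARY KEY, item TEXT);
    CREATE TABLE outbox(msg_id INTEGER PRIMARY KEY, payload TEXT, status TEXT);
"""
SCHEMA_B = """
    CREATE TABLE inbox(msg_id INTEGER PRIMARY KEY);
    CREATE TABLE shipments(item TEXT);
"""


def open_db(schema):
    db = sqlite3.connect(":memory:")
    db.executescript(schema)
    return db


def place_order(db_a, order_id, item):
    # System A: business row and message row commit or roll back together.
    with db_a:
        db_a.execute("INSERT INTO orders VALUES (?, ?)", (order_id, item))
        db_a.execute("INSERT INTO outbox VALUES (?, ?, 'pending')",
                     (order_id, item))


def publish_pending(db_a, mq):
    # Used for the first send and for the periodic rescan and resend.
    rows = db_a.execute(
        "SELECT msg_id, payload FROM outbox WHERE status = 'pending'").fetchall()
    for msg_id, payload in rows:
        mq.append((msg_id, payload))


def mark_done(db_a, msg_id):
    # System A's interface for B to report success.
    with db_a:
        db_a.execute("UPDATE outbox SET status = 'done' WHERE msg_id = ?",
                     (msg_id,))


def consume(db_b, mq, ack):
    # System B: the inbox primary key rejects a message seen before,
    # which rolls back the transaction and skips the business operation.
    while mq:
        msg_id, item = mq.popleft()
        try:
            with db_b:
                db_b.execute("INSERT INTO inbox VALUES (?)", (msg_id,))
                db_b.execute("INSERT INTO shipments VALUES (?)", (item,))
        except sqlite3.IntegrityError:
            pass  # duplicate delivery, already processed
        ack(msg_id)
```

If `publish_pending` runs twice before B consumes, the queue holds the same message twice, but `shipments` still ends up with a single row because the second insert into `inbox` fails and rolls back.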
To be honest, the biggest problem with this scheme is that it relies heavily on a database message table to manage transactions. What happens under high concurrency? How does it scale? That is why it is rarely used in practice.
Reliable message eventual consistency scheme
In this scheme you simply drop the local message table and implement the transaction directly on top of MQ. Alibaba's RocketMQ, for example, supports transactional messages.
The general idea is:
- System A first sends a prepared message to MQ. If sending the prepared message fails, the operation is cancelled and nothing is executed;
- If the message is sent successfully, system A executes its local transaction. If the local transaction succeeds, it tells MQ to send a confirmation message; if it fails, it tells MQ to roll back the message;
- If the confirmation message is sent, system B receives it and executes its own local transaction;
- MQ automatically polls all prepared messages at intervals and calls back your interface to ask: did the local transaction for this message fail, and is that why no confirmation was ever sent? Should it keep retrying or roll back? Typically you check the database here to see whether the local transaction was executed; if it was rolled back, roll back here as well. This guards against the case where the local transaction succeeded but the confirmation message failed to send;
- What if system B's transaction fails? Retry automatically until it succeeds. If that really does not work, then for important business such as funds, roll back: for example, after system B rolls back locally, find a way to notify system A to roll back as well, or raise an alert so that someone can roll back and compensate manually;
- This scheme is a reasonable choice, and it is how most Internet companies in China do it today. Either use RocketMQ's built-in support, or build a similar layer of logic yourself on top of something like ActiveMQ or RabbitMQ. In short, the idea is the same.
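The prepared, confirm and check-back steps above can be sketched with a toy in-memory broker. This is not the RocketMQ API: `ToyBroker`, `system_a_send`, the `confirm_lost` flag and the `committed_ids` set (standing in for system A's database) are all hypothetical names chosen for the example.

```python
class ToyBroker:
    """Hypothetical in-memory broker; not the RocketMQ API."""

    def __init__(self):
        self.prepared = {}    # msg_id -> payload, invisible to consumers
        self.delivered = []   # messages visible to system B

    def send_prepared(self, msg_id, payload):
        self.prepared[msg_id] = payload

    def commit(self, msg_id):
        # Confirmation: the prepared message becomes visible to system B.
        self.delivered.append(self.prepared.pop(msg_id))

    def rollback(self, msg_id):
        self.prepared.pop(msg_id, None)

    def check_back(self, local_tx_succeeded):
        # Poll prepared messages that were never confirmed and ask the
        # producer whether its local transaction actually went through.
        for msg_id in list(self.prepared):
            if local_tx_succeeded(msg_id):
                self.commit(msg_id)
            else:
                self.rollback(msg_id)


def system_a_send(broker, committed_ids, msg_id, payload, local_tx,
                  confirm_lost=False):
    """Returns True if system A's local transaction succeeded."""
    broker.send_prepared(msg_id, payload)   # step 1: prepared message
    try:
        local_tx()                          # step 2: local transaction
    except Exception:
        broker.rollback(msg_id)             # local failure: roll back the message
        return False
    committed_ids.add(msg_id)               # stand-in for A's database record
    if not confirm_lost:                    # confirm_lost simulates a lost confirmation
        broker.commit(msg_id)
    return True
```

When the confirmation is lost, the message stays in `prepared` until `broker.check_back(lambda msg_id: msg_id in committed_ids)` runs, at which point it is delivered because system A's record shows the local transaction succeeded.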
Best-effort notification scheme
The general idea of this scheme is:
- After its local transaction completes, system A sends a message to MQ;
- A dedicated best-effort notification service consumes the message from MQ, records it in a database or puts it in an in-memory queue, and then calls system B's interface;
- If system B executes successfully, that is the end of it. If system B fails, the best-effort notification service periodically retries the call to system B, up to N times, and finally gives up if it still does not succeed.
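The retry-then-give-up behaviour of the notification service can be sketched as a bounded retry loop. The names `notify_best_effort`, `max_attempts`, `delay_seconds` and the `FlakySystemB` test double are assumptions for the example; a real service would also persist the message and its attempt count between retries.

```python
import time


def notify_best_effort(call_b, message, max_attempts=3, delay_seconds=0.0):
    """Call system B up to max_attempts times.

    Returns the attempt number that succeeded, or None after giving up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            call_b(message)
            return attempt
        except Exception:
            if attempt < max_attempts:
                time.sleep(delay_seconds)   # wait before the next retry
    return None                             # N attempts used up: give up


class FlakySystemB:
    """Hypothetical stand-in for system B that fails a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.received = []

    def __call__(self, message):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("system B unavailable")
        self.received.append(message)
```

Unlike the reliable message scheme, nothing here rolls back system A when the retries run out, so this approach only fits notifications whose loss the business can tolerate or reconcile later.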