This article was first published on the 51CTO Technology Stack public account. Author: Chen Caihua. For reprint or exchange, please contact [email protected].
This article introduces what a distributed transaction is, the problems distributed transactions solve, the difficulties in implementing them, the ideas behind the solutions, and how to choose a solution for different scenarios, organizing, summarizing, and comparing them with the help of diagrams.
After reading this article, when it comes to distributed transactions you should no longer be left with isolated fragments of knowledge such as "2PC", "3PC", "MQ message transactions", "eventual consistency", and "TCC", but should be able to link them together into a coherent body of knowledge.
1 What is a transaction
Before introducing distributed transactions, let’s first introduce what a transaction is.
The specific definition of a transaction
Transactions provide a mechanism that bundles all the operations involved in an activity into one indivisible unit of execution. A transaction can be committed only if all of its constituent operations execute normally; if any operation fails, the entire transaction is rolled back.
Simply put, transactions provide an “All or Nothing” mechanism.
ACID properties for database transactions
Transactions operate on data, and the data a transaction operates on is usually stored in a database, so any introduction to transactions has to cover the ACID properties of database transactions. ACID is an acronym for the four basic properties that a correctly executing database transaction must have:
- Atomicity: All operations in a transaction either complete entirely or not at all; the transaction cannot stop at some point in between. If a transaction fails during execution, it is rolled back to the state before it started, as if it had never been executed. For example, a bank transfer of 100 yuan from account A to account B consists of two steps: (1) withdraw 100 yuan from account A; (2) deposit 100 yuan into account B. Either both steps complete together or neither does; if only the first step succeeded and the second failed, your money would be 100 yuan short for no reason.
- Consistency: The integrity constraints on the data are not broken before or after the transaction. For example, given an integrity constraint A + B = 100, if a transaction changes A it must also change B so that A + B = 100 still holds when the transaction ends; otherwise the transaction fails.
- Isolation: The database allows multiple concurrent transactions to read, write, and modify data at the same time. If a transaction accesses data that is being modified by another transaction, then as long as the other transaction has not committed, the data it sees is unaffected by that uncommitted transaction. Isolation prevents data inconsistency caused by the interleaved execution of concurrent transactions. For example, while a transaction transferring 100 yuan from account A to account B is still in progress, if B queries its own account it cannot yet see the extra 100 yuan.
- Durability: Once a transaction ends, its changes to the data are permanent and are not lost even if the system fails.
In simple terms, ACID describes the characteristics of a transaction in different dimensions:
- Atomicity – the integrity of transaction operations
- Consistency – the correctness of the data under a transaction operation
- Isolation – the correctness of data under concurrent transaction operations
- Durability – the reliability of the transaction's changes to the data
A database that supports transactions must provide all four of these properties; otherwise the correctness of the data cannot be guaranteed during transaction processing, and the results are likely to fall short of what the requester expects.
When to use database transactions
With the basic concepts covered, when should you use a database transaction? Simply put: whenever the business requires that a group of data operations either all succeed or all fail, so that if any one of them fails, the whole group takes no effect and the data reverts to its original state.
When using database transactions, keep them as short as possible. A long transaction that modifies data in many different tables can block all other users of the system and is likely to cause serious performance problems.
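To make the "all or nothing" behavior concrete, here is a minimal JDBC sketch of the bank transfer example from the ACID section. The `account` table, the balances, and the H2 in-memory connection URL are assumptions for illustration (the H2 driver is assumed to be on the classpath); any database that supports transactions behaves the same way.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class TransferDemo {

    // Transfer `amount` from account `from` to account `to` as one transaction.
    static void transfer(Connection conn, String from, String to, long amount) throws SQLException {
        conn.setAutoCommit(false); // start a transaction: the two updates form one unit
        try (PreparedStatement debit = conn.prepareStatement(
                 "UPDATE account SET balance = balance - ? WHERE id = ? AND balance >= ?");
             PreparedStatement credit = conn.prepareStatement(
                 "UPDATE account SET balance = balance + ? WHERE id = ?")) {
            debit.setLong(1, amount); debit.setString(2, from); debit.setLong(3, amount);
            if (debit.executeUpdate() != 1) {
                throw new SQLException("insufficient balance or unknown account: " + from);
            }
            credit.setLong(1, amount); credit.setString(2, to);
            credit.executeUpdate();
            conn.commit();   // both steps take effect together
        } catch (SQLException e) {
            conn.rollback(); // any failure reverts both steps: "All or Nothing"
            throw e;
        }
    }

    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo")) {
            try (Statement st = conn.createStatement()) {
                st.execute("CREATE TABLE account(id VARCHAR(10) PRIMARY KEY, balance BIGINT)");
                st.execute("INSERT INTO account VALUES('A', 500), ('B', 0)");
            }
            transfer(conn, "A", "B", 100); // afterwards A: 400, B: 100
        }
    }
}
```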
2 What are distributed transactions
After introducing the basic concepts related to transactions, it’s time to introduce distributed transactions.
Background and concept of distributed transactions
With the rapid development of the Internet, service architecture patterns such as microservices and SOA are being used on a large scale. A distributed system today is generally composed of multiple independent subsystems that cooperate through network communication to complete each function.
Many business processes span multiple subsystems. A typical example is the e-commerce order payment process, which involves at least a trading system and a payment system, and this process naturally involves the concept of a transaction: guaranteeing data consistency between the trading system and the payment system. We call such a cross-system transaction a distributed transaction. More precisely, a distributed transaction is one in which the participants, the servers supporting the transaction, the resource servers, and the transaction manager are located on different nodes of a distributed system.
Take a common Internet trading business as an example:
The figure above shows two separate microservices, inventory and order, each maintaining its own database. In the trading system's business logic, placing an order first calls the inventory service to deduct stock and then calls the order service to create the order record.
As you can see, if data updates between multiple databases do not guarantee transactions, it can lead to subsystem data inconsistency and business problems.
The difficulties of distributed transactions
- Transaction atomicity: Transactions operate across different nodes. When the operation on one node fails, the atomicity of the multi-node operation must still be guaranteed: either All or Nothing.
- Transaction consistency: When a network transmission failure or a node failure interrupts the data replication channel between nodes, transaction operations must still preserve consistency, ensuring that no operation leaves the data violating the constraints and trigger rules defined by the database.
- Transaction isolation: The essence of isolation is correctly handling read-write and write-write conflicts among concurrent transactions. In distributed transaction control, commits may happen asynchronously, so "partially committed" transactions can exist; if concurrent access to the data is not controlled at that point, dirty reads may occur.
3 Consistency of distributed systems
The difficulties described above all ultimately lead to data inconsistency. Below, the consistency problem of distributed systems is analyzed theoretically; the distributed transaction solutions introduced later are based on these theories.
The conflict between availability and consistency – CAP theory
The CAP theorem, also known as Brewer's theorem, was proposed as a conjecture in 2000 by Eric Brewer, a computer scientist at the University of California, Berkeley. In 2002, Seth Gilbert and Nancy Lynch of the Massachusetts Institute of Technology published a proof of Brewer's conjecture, making it an accepted theorem in distributed computing.
When Brewer proposed the CAP conjecture, he did not precisely define the meanings of Consistency, Availability, and Partition Tolerance, and different sources define them differently. For clarity, Robert Greiner's article "CAP Theorem" is used here as the reference.
- The CAP theorem states that in a distributed system (a set of interconnected nodes that share data), when read and write operations are involved, only two of the three properties Consistency, Availability, and Partition Tolerance can be guaranteed; the third must be sacrificed.
Consistency, Availability and Partition Tolerance are explained as follows:
- C – Consistency
"A read is guaranteed to return the most recent write for a given client." That is, for a given client, a read operation is guaranteed to return the latest write result.
Note that this does not say that all nodes hold the same data at the same moment. While the system is executing a transaction it is actually in an inconsistent state, and the data on different nodes need not be identical.
Consistency emphasizes that a client's read returns the result of the latest write: while a transaction is executing, the client cannot read uncommitted data; only after the transaction commits can the client read the data the transaction wrote; and if the transaction fails and is rolled back, the client never reads the data the transaction wrote in between.
- A – Availability
"A non-failing node will return a reasonable response within a reasonable amount of time (no error or timeout)." That is, a non-failing node returns a reasonable response (neither an error nor a timeout) within a reasonable amount of time.
The emphasis here is on a reasonable response: no timeout and no error. Note that it does not say a "correct" result; for example, returning 90 when the correct value is 100 is certainly not correct, but it can still be a reasonable response.
- P – Partition Tolerance
"The system will continue to function when network partitions occur." That is, when a network partition occurs, the system can keep "performing its duties."
A network partition means that in a distributed system whose nodes are supposed to be connected, some fault (a broken link between nodes, a node crash) leaves groups of nodes unable to reach each other, splitting the network into several areas with the data scattered across these disconnected areas.
Selection of consistency, availability, and partition tolerance
Although the CAP theorem says only two of the three properties can be chosen, in a distributed environment P (partition tolerance) must always be one of them, because the network can never be 100% reliable and may fail at any time, so partitions are an inevitable phenomenon.
If we chose CA (consistency + availability) and gave up P (partition tolerance), then when a partition occurred the system would have to forbid writes in order to preserve C (consistency): any write request would get an error back (for example, "the system does not currently allow writes"). But that conflicts with A (availability), which requires that neither an error nor a timeout be returned.
Therefore, it is theoretically impossible for a distributed system to adopt a CA (consistency + availability) architecture. It can only adopt CP (consistency + partition tolerance) or AP (availability + partition tolerance) and make a trade-off between consistency and availability.
- CP – Consistency + Partition Tolerance
As shown in the figure above, the connection between Node1 and Node2 is interrupted, causing a partition. The data on Node1 is updated to y, but because the replication channel between Node1 and Node2 is down the update cannot be synchronized to Node2, which still holds the old value x.
In this case, when client C accesses Node2, Node2 returns an error indicating that the system is faulty. This handling violates the Availability requirement, so only CP can be satisfied.
- AP – Availability + Partition Tolerance
When client C accesses Node2, Node2 returns its current value x, although the actual latest value is y. This violates the Consistency requirement, so only AP can be satisfied.
Note: the x that Node2 returns here is not a "correct" result, but it is a "reasonable" one, because x is stale data rather than a garbage value; it is simply not the latest.
It is worth adding that although CAP theory tells us a distributed system can only choose AP or CP, this does not mean the whole system must make a single choice. When putting CAP into practice, we should classify the data within the system according to application scenario and requirements, and choose a different strategy (CP or AP) for each class of data, rather than imposing one strategy on all data in the system.
In addition, choosing CP or AP only means the system cannot guarantee both C (consistency) and A (availability) while a partition exists; it does not mean nothing should be done about the other. Once the partition failure is resolved, the system should restore both C and A.
The extension of CAP theory — BASE theory
BASE stands for Basically Available, Soft State, and Eventual Consistency. Its core idea is that even when consistency in the CAP sense cannot be achieved, an application can use suitable means to achieve eventual consistency.
- BA – Basically Available: In the event of a failure, a distributed system allows a partial loss of availability, keeping the core available.
The key words here are "partial" and "core". In practice, which parts are core must be weighed against the specific business. For example, the login function is more core than registration: a registration outage at most loses some new users, a limited impact, but if registered users cannot log in they cannot use the system at all, a much greater impact.
- S – Soft State: The system is allowed to have an intermediate state that does not affect its overall availability. The intermediate state here corresponds to the data inconsistency in CAP theory.
- E – Eventual Consistency: After a certain amount of time, all data copies in the system reach a consistent state.
The key words here are "a certain time" and "eventually". How long "a certain time" is depends strongly on the characteristics of the data: different businesses and different data tolerate different periods of inconsistency. For example, a payment business requires consistency within seconds, because the user is watching the whole time, whereas users can tolerate a celebrity's latest microblog post taking 30 minutes to become consistent, since not seeing it for a short while goes unnoticed. "Eventually" means that no matter how long it takes, a consistent state must ultimately be reached.
BASE theory is essentially an extension and complement to CAP, and more specifically, a complement to the AP scheme in CAP:
- CAP theory ignores latency, which in practice is unavoidable. This means there is no perfect CP scenario: even a few milliseconds of replication delay mean the system does not meet CP requirements during that interval. So the CP schemes in CAP actually achieve eventual consistency as well; it is just that "a certain amount of time" is only a few milliseconds.
- Sacrificing consistency in an AP scheme applies only for the duration of the partition failure, not forever. This is where BASE extends CAP: consistency is sacrificed during the partition, but after the partition is repaired the system should return to eventual consistency.
Data consistency model
The BASE model mentioned "strong consistency" and "eventual consistency"; these consistency models are described below.
Distributed systems replicate data to improve reliability and fault tolerance, storing different copies of the data on different machines. Because keeping replicas strictly consistent is costly, many systems adopt weaker consistency to improve performance. Common consistency models are described below:
- Strong consistency: No matter which replica an update is performed on, all subsequent reads must retrieve the latest data. For single-copy data, reads and writes operate on the same copy, so strong consistency is guaranteed naturally; for multi-copy data, a distributed transaction protocol is required.
- Weak consistency: Under this model, it takes a while before a user can read the result of an operation that updated specific data in the system. This period is called the "inconsistency window".
- Eventual consistency: A special case of weak consistency in which the system guarantees that the user will eventually be able to read an operation's update to specific data (provided no other updates to that data occur before the read). The size of the inconsistency window depends on interaction latency, system load, and the number of data replicas.
Which consistency model a system chooses depends on the application's consistency requirements; the choice also affects how the system handles user requests and which replica-maintenance techniques it uses. The distributed transaction solutions presented later are based on the consistency models described above.
Flexible transactions
The concept of flexible transactions
In Internet scenarios such as e-commerce, traditional transactions run into bottlenecks in database performance and processing capacity. Building on CAP theory and BASE theory in the distributed domain, the concept of the flexible transaction was proposed.
Following the design ideas of BASE theory, a flexible transaction allows the system to pass through an intermediate state of data inconsistency (Soft State) without affecting overall availability, with the data becoming consistent after a synchronization delay. Rather than abandoning ACID altogether, flexible transactions relax the consistency requirement: with the help of local transactions they achieve eventual consistency for the distributed transaction while preserving system throughput.
Features needed to implement flexible transactions
The following are common features needed to implement flexible transactions. Not all of them are required in every scenario, since different scenarios have different requirements.
Visibility (external queryability): During the execution of a distributed transaction, if some step fails, the processing status of the other operations must be clearly determinable. This requires the other services to provide query interfaces so that the status of each operation can be established by querying.
To make operations queryable, each invocation of each service needs a globally unique identifier, which can be a business document number (such as an order number) or a system-assigned operation sequence number (such as a payment record sequence number). The time of each operation should also be fully recorded.
Operational idempotence: Idempotence is originally a mathematical concept. An idempotent function, or idempotent method, is one that can be executed repeatedly with the same arguments and always produce the same result. The defining property of an idempotent operation is that executing it any number of times has the same effect as executing it once: the same method, called multiple times with the same parameters, yields the same business result as a single call.
Operational idempotence is required because many transaction protocols rely on retries to ensure eventual data consistency, and a method that is not idempotent cannot safely be retried. There are many ways to implement idempotence, for example caching all requests and their processing results in the system and, when a repeated operation is detected, directly returning the result of the previous execution.
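As a minimal sketch of the "cache requests and results" approach just described (class and method names here are illustrative, not from any framework): the operation's globally unique identifier serves as the deduplication key, so a retry returns the stored result instead of executing the business logic again.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// A minimal idempotent executor: the same operation id always yields the same result.
public class IdempotentExecutor {
    // In production this would be a database table or Redis, keyed by the global operation id.
    private final Map<String, Object> processed = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    public <T> T execute(String operationId, Supplier<T> businessAction) {
        // computeIfAbsent runs the action at most once per operation id;
        // repeated calls with the same id return the cached result.
        return (T) processed.computeIfAbsent(operationId, id -> businessAction.get());
    }

    public static void main(String[] args) {
        IdempotentExecutor executor = new IdempotentExecutor();
        String paymentSeqNo = "PAY-20240001"; // globally unique identifier for the call
        // The first call executes the business logic; the retry returns the same result.
        String r1 = executor.execute(paymentSeqNo, () -> "deducted 100 yuan");
        String r2 = executor.execute(paymentSeqNo, () -> "deducted 100 yuan");
        System.out.println(r1.equals(r2)); // true: calling twice equals calling once
    }
}
```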
4 Common distributed transaction solutions
Having covered the consistency theory of distributed systems, we now introduce common distributed transaction solutions based on the different consistency models; the application scenarios of each solution are discussed at the end.
There are many implementations of distributed transactions. A classic family of solutions derives from the XA distributed transaction protocol defined by the X/Open organization (with Tuxedo as a representative implementation); the best-known members are two-phase commit (2PC) and three-phase commit (3PC).
4.1 2PC (two-phase commit) scheme — strong consistency
Scheme introduction
Two-phase commit (2PC) is a commonly used distributed transaction solution that divides the commit process into two phases: a prepare phase and a commit phase. The initiator of the transaction is called the coordinator, and the executors of the transaction are called the participants.
In a distributed system, each node knows whether its own operation succeeded or failed, but cannot know whether the operations on other nodes succeeded. To maintain atomicity and consistency when a transaction spans multiple nodes, a coordinator is introduced to uniformly control the actions of all participants and tell them whether to actually commit or roll back.
The idea of two-phase commit can be summarized as follows: the participants report the success or failure of their operations to the coordinator, and the coordinator, based on the feedback from all participants, decides whether each participant should commit or abort.
The core idea is to handle each transaction in a "try first, commit later" manner. After the commit, every read must be able to obtain the latest data, so two-phase commit can be regarded as a strong consistency algorithm.
Processing flow
To make this easier to picture, think of the coordinator as the boss and the participants as the crew: the boss coordinates the tasks that the crew carries out.
Phase 1: Preparation phase
- 1. The coordinator sends the transaction content to all participants, asks whether the transaction can be committed, and waits for replies from all of them.
- 2. Each participant executes the transaction operations and writes undo and redo information to its transaction log (but does not commit).
- 3. If execution succeeds, the participant replies yes to the coordinator, meaning it can commit; if execution fails, it replies no, meaning it cannot.
Phase 2: Commit phase. If the coordinator receives a failure reply or a timeout from any participant, it sends a rollback message to every participant; otherwise it sends a commit message. Participants then commit or roll back as instructed and release all lock resources held during the transaction. (Note: lock resources must be released in this final phase.) The commit phase falls into two cases.
Case 1: all participants replied yes, so the transaction is committed:
- 1. The coordinator issues a formal commit request to all participants.
- 2. Each participant executes the commit and releases the resources occupied during the whole transaction.
- 3. Each participant replies to the coordinator with an ack (acknowledgement) when done.
- 4. After receiving acks from all participants, the coordinator completes the transaction commit.
Case 2: some participant replied no in phase 1, so the transaction is aborted:
- 1. The coordinator issues a rollback request to all participants.
- 2. Each participant uses the undo information written in phase 1 to roll back and releases the resources occupied during the whole transaction.
- 3. Each participant replies to the coordinator with an ack when done.
- 4. After receiving acks from all participants, the coordinator completes the transaction abort.
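The two phases above condense into a small sketch of the coordinator's decision logic. The `Participant` interface and method names are assumptions for illustration; a real implementation would also persist its decisions in a log and run both phases under timeouts.

```java
import java.util.List;

// Minimal 2PC coordinator sketch; logging, persistence, and timeouts omitted.
interface Participant {
    boolean prepare(String txId);  // phase 1: execute, write undo/redo, vote yes/no
    void commit(String txId);      // phase 2: make changes permanent, release locks
    void rollback(String txId);    // phase 2: undo via the logged info, release locks
}

class TwoPhaseCommitCoordinator {
    boolean runTransaction(String txId, List<Participant> participants) {
        // Phase 1: ask every participant to prepare and collect the votes.
        boolean allYes = true;
        for (Participant p : participants) {
            try {
                if (!p.prepare(txId)) { allYes = false; break; }
            } catch (RuntimeException failureOrTimeout) {
                allYes = false; // a failure or timeout counts as a "no" vote
                break;
            }
        }
        // Phase 2: commit only if every vote was yes; otherwise roll everyone back.
        for (Participant p : participants) {
            if (allYes) p.commit(txId); else p.rollback(txId);
        }
        return allYes;
    }
}
```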
Scheme summary
The 2PC scheme is simple to implement but rarely used in real projects, mainly because of the following problems:
- Performance: all participants are blocked synchronously during the commit process, holding system resources, which easily becomes a performance bottleneck.
- Reliability: the coordinator is a single point of failure; if the coordinator fails, participants remain blocked holding their locks.
- Data consistency: in phase 2, if a local network problem causes one participant to receive the commit message while another does not, the nodes end up with inconsistent data.
4.2 3PC (three-phase commit) scheme
Scheme introduction
The three-phase commit protocol is an improved version of the two-phase commit protocol. Unlike two-phase commit, it introduces a timeout mechanism in both the coordinator and the participants.
Three-phase commit splits 2PC's prepare phase in two by inserting a preCommit phase. This resolves the problem in 2PC where, after preparing, a participant could be stuck in an "uncertain state" for a potentially long time, unable to know whether to commit or abort, if the coordinator crashed or errored.
Processing flow
Phase 1: canCommit. The coordinator sends a commit inquiry to the participants; a participant replies yes if it believes it can commit (it does not execute any transaction operations yet), or no otherwise:
- 1. The coordinator issues a canCommit request containing the transaction content to all participants, asks whether the transaction can be committed, and waits for replies from all of them.
- 2. On receiving the canCommit request, a participant replies yes and enters the prepared state if it believes it can execute the transaction; otherwise it replies no.
Phase 2: preCommit. Based on the participants' canCommit responses in phase 1, the coordinator decides whether the transaction's preCommit operation can be performed. There are two possibilities.
Case 1: all participants replied yes in phase 1, so the participants pre-execute the transaction:
- 1. The coordinator issues a preCommit request to all participants and enters the prepared phase.
- 2. On receiving the preCommit request, a participant executes the transaction operations and writes undo and redo information to its transaction log (but does not commit).
- 3. Each participant replies to the coordinator with an ack or a no response and waits for the final instruction.
Case 2: some participant replied no in phase 1, or the coordinator times out without receiving replies from all participants, so the transaction is aborted:
- 1. The coordinator issues an abort request to all participants.
- 2. A participant aborts the transaction whether it receives the abort request from the coordinator or times out waiting for the coordinator's request.
Phase 3: doCommit. This phase performs the real transaction commit or abort; again there are two cases.
Case 1: all participants replied ack in phase 2, so the real commit is performed:
- 1. If the coordinator is working normally, it issues a doCommit request to all participants.
- 2. On receiving the doCommit request, a participant formally commits the transaction and releases the resources occupied during the whole transaction.
- 3. Each participant replies to the coordinator with an ack when done.
- 4. After receiving acks from all participants, the coordinator completes the transaction commit.
Case 2: some participant replied no in phase 2, or the coordinator times out without receiving replies from all participants, so the transaction is aborted:
- 1. If the coordinator is working normally, it issues an abort request to all participants.
- 2. A participant uses the undo information written in phase 2 to roll back and releases the resources occupied during the whole transaction.
- 3. Each participant replies to the coordinator with an ack when done.
- 4. After receiving acks from all participants, the coordinator completes the transaction abort.
Note: once phase 3 is entered, whether the coordinator fails or the network between the coordinator and a participant fails, a participant may never receive the doCommit or abort request from the coordinator. In that case, the participant goes ahead and commits the transaction after its wait times out.
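This timeout behavior is the key difference from 2PC, and it can be sketched as a participant-side rule (a sketch with illustrative names, not a full protocol implementation): before acking a preCommit, the safe default on timeout is to abort; afterwards, the default is to commit.

```java
// Sketch of the timeout rules that distinguish a 3PC participant from a 2PC one.
class ThreePhaseParticipant {
    enum State { WAIT_PRE_COMMIT, WAIT_DO_COMMIT }

    private State state = State.WAIT_PRE_COMMIT;

    void onPreCommit(String txId) {
        // Execute the transaction, write undo/redo logs, then ack the coordinator.
        state = State.WAIT_DO_COMMIT;
    }

    // Invoked when the coordinator stays silent beyond the timeout.
    void onTimeout(String txId) {
        if (state == State.WAIT_PRE_COMMIT) {
            abort(txId);  // never saw preCommit: the safe default is to abort
        } else {
            commit(txId); // preCommit was acked: assume agreement and commit
        }
    }

    private void commit(String txId) { /* make changes permanent, release locks */ }
    private void abort(String txId)  { /* roll back via undo info, release locks */ }
}
```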
Scheme summary
- Advantages: Compared with two-phase commit, three-phase commit shrinks the blocking range: the coordinator or a participant aborts the transaction after a wait timeout. It also mitigates the coordinator single point, since participants in phase 3 go ahead and commit even when the coordinator fails.
- Disadvantages: The data inconsistency problem still exists. After a participant has received a preCommit request and is waiting for the doCommit instruction, if the coordinator decides to abort the transaction but cannot communicate with that participant, the participant's timeout will make it commit anyway, producing inconsistent data.
4.3 TCC (Try-Confirm-Cancel) transactions — eventual consistency
Scheme introduction
The concept of TCC (Try-Confirm-Cancel) was first proposed by Pat Helland in his 2007 paper "Life Beyond Distributed Transactions: An Apostate's Opinion".
TCC is a service-oriented two-phase programming model whose Try, Confirm, and Cancel methods are all implemented by business code.
- The Try operation, as phase one, is responsible for checking and reserving resources.
- The Confirm operation, as phase two's commit, performs the real business.
- The Cancel operation releases the resources reserved in Try.
The Try, Confirm, and Cancel of a TCC transaction can be understood, loosely, as the Lock, Commit, and Rollback of a SQL transaction.
Processing flow
To make this concrete, the following analyzes the scheme using e-commerce order placement as an example. The whole process is split into two steps, inventory deduction and order creation, with the inventory service and the order service on different server nodes.
1. Try phase. In terms of execution, the business logic at this stage looks the same as in a traditional transaction, but its business meaning differs: the Try in TCC is only a preliminary operation, which together with the subsequent Confirm constitutes the complete business logic. This stage mainly accomplishes:
- Completing all business checks (consistency)
- Reserving the required business resources (quasi-isolation)
The TCC mechanism centers on the initial operation (Try), around which Confirm and Cancel revolve. The Try phase operation therefore enjoys the best guarantees: even if it fails, the Cancel operation can still undo its effects.
Suppose the item's inventory is 100 and the purchase quantity is 2. In the Try phase, besides checking and updating the inventory, the quantity the user is buying is frozen, and an order is created with status "to be confirmed".
2. Confirm/Cancel phase. Depending on whether all services executed normally in the Try phase, either the Confirm or the Cancel operation is performed. Both must be idempotent: if Confirm or Cancel fails, it is retried until it completes.
Confirm: when all services in the Try phase executed normally, the Confirm business logic is performed.
It must use only the business resources reserved in the Try phase. The TCC mechanism assumes that if resources were reserved successfully during Try, Confirm must be able to commit completely and correctly. The Confirm phase can also be seen as the complement of the Try phase: together, Try and Confirm form the complete business logic.
Cancel: when some service fails to execute in the Try phase, the Cancel phase is entered and the reserved resources are released.
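A minimal sketch of what the inventory service's TCC interface might look like in this ordering example. The interface shape, method names, and the SQL in the comments are assumptions; real TCC frameworks and business code define their own.

```java
// Illustrative TCC interface for the inventory service in the ordering example.
interface InventoryTccService {
    // Try: check the business rules and reserve the resource (freeze the quantity).
    boolean tryDeduct(String txId, String skuId, int quantity);

    // Confirm: must be idempotent; turn the frozen quantity into a real deduction.
    boolean confirmDeduct(String txId);

    // Cancel: must be idempotent; release the frozen quantity.
    boolean cancelDeduct(String txId);
}

class InventoryTccServiceImpl implements InventoryTccService {
    public boolean tryDeduct(String txId, String skuId, int quantity) {
        // e.g. UPDATE stock SET available = available - q, frozen = frozen + q
        //      WHERE sku = ? AND available >= q
        // and record the reservation under txId so Confirm/Cancel can find it.
        return true;
    }
    public boolean confirmDeduct(String txId) {
        // e.g. UPDATE stock SET frozen = frozen - q for txId's reservation;
        // if txId was already confirmed, report success again (idempotence).
        return true;
    }
    public boolean cancelDeduct(String txId) {
        // e.g. UPDATE stock SET available = available + q, frozen = frozen - q;
        // if txId was already cancelled or never tried, report success (idempotence).
        return true;
    }
}
```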
Scheme summary
Compared with the traditional transaction mechanism (X/Open XA) described above, the TCC transaction mechanism has the following advantages:
- Performance: the granularity of resource locking is reduced; the entire resource is not locked, which improves the performance of the specific business.
- Eventual data consistency: the idempotence of Confirm and Cancel guarantees that the transaction is eventually confirmed or cancelled, ensuring data consistency.
- Reliability: it solves the coordinator single-point-of-failure problem of the XA protocol; the main business party initiates and controls the whole business activity, and the business activity manager becomes multi-point and can be clustered.
Disadvantage: the Try, Confirm, and Cancel operations of TCC must be implemented for each specific business, so business coupling is high, which increases development cost.
4.4 Local message table — eventual consistency
Scheme introduction
The local message table solution was originally proposed by eBay. Its core idea is to split a distributed transaction into local transactions for processing.
In this scheme, the transaction active party (the initiator) creates an additional transaction message table. Within one local transaction, the initiator processes the business and records the transaction message; it then polls the transaction message table and sends the messages, while the transaction passive party consumes the transaction messages via message middleware.
This design avoids the awkward situations of "business processed successfully + transaction message failed to send" and "business processing failed + transaction message sent successfully", guaranteeing data consistency between the two systems' transactions.
Processing flow
The party whose transaction is processed first in the distributed transaction is called the transaction active party; the parties whose transactions are processed afterwards are called the transaction passive parties.
For ease of understanding, the following continues with the e-commerce ordering example. The whole process is split into two steps, inventory deduction and order creation, with the inventory service and order service on different server nodes; the inventory service is the transaction active party and the order service is the transaction passive party.
The transaction active party needs to create an additional transaction message table to record the creation and processing status of distributed transaction messages.
The whole business process is as follows:
- Step 1: The transaction active party processes its local transaction. Within one local transaction, it performs the business update and writes to the message table. In the example, the inventory service deducts the inventory and writes the message table inside a single local transaction (figures 1 and 2).
- Step 2: The transaction active party notifies the transaction passive party, through message middleware, of the pending transaction message. Using a message queue such as Kafka or RocketMQ, the active party writes the message to the queue, and the passive party consumes and processes it. In the example, the inventory service writes the pending transaction message to the middleware, and the order service consumes it and creates the order (figures 3-5).
- Step 3: The transaction passive party notifies the active party, through the message middleware, that the message has been processed. In the example, the order service writes a "transaction processed" message to the middleware; the inventory service consumes it and updates the status of the transaction message to completed (figures 6-8).
For data consistency, wherever a processing error requires a retry, the business processing of both the transaction active party and the passive party must be idempotent. The specific fault-tolerance guarantees are as follows:
- 1. If step 1 fails, the local transaction rolls back, as if nothing had happened.
- 2. If step 2 or step 3 fails, then because the unprocessed transaction message is still recorded on the active side, the active party can periodically poll for timed-out messages and resend them through the message middleware, and the passive party consumes the message again and retries.
- 3. If the failure is a business failure, the transaction passive party can send a message to the active party to trigger a rollback.
- 4. If other passive parties have already consumed the message, the active party needs to notify them to roll back as well.
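A minimal sketch of the active party's side of the scheme, using JDBC directly to stay self-contained. The `stock` and `transaction_message` tables, their columns, and the `sendToMq` helper are assumptions for illustration:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch of the active party (inventory service). Table and column names are assumed:
//   transaction_message(msg_id, payload, status)  with status: NEW / SENT / DONE
class InventoryTxActiveParty {

    // Step 1: business update and message record in ONE local transaction.
    void deductAndRecordMessage(Connection conn, String skuId, int qty, String msgId)
            throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement deduct = conn.prepareStatement(
                 "UPDATE stock SET available = available - ? WHERE sku = ? AND available >= ?");
             PreparedStatement record = conn.prepareStatement(
                 "INSERT INTO transaction_message(msg_id, payload, status) VALUES(?, ?, 'NEW')")) {
            deduct.setInt(1, qty); deduct.setString(2, skuId); deduct.setInt(3, qty);
            if (deduct.executeUpdate() != 1) throw new SQLException("insufficient stock");
            record.setString(1, msgId);
            record.setString(2, "{\"sku\":\"" + skuId + "\",\"qty\":" + qty + "}");
            record.executeUpdate();
            conn.commit();   // business change and message row succeed or fail together
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }

    // Step 2 fault tolerance: periodically resend messages not yet marked DONE.
    void pollAndResend(Connection conn) throws SQLException {
        try (PreparedStatement query = conn.prepareStatement(
                 "SELECT msg_id, payload FROM transaction_message WHERE status <> 'DONE'");
             ResultSet rs = query.executeQuery()) {
            while (rs.next()) {
                // Resending may repeat a message, which is why the passive party
                // must process messages idempotently.
                sendToMq(rs.getString("msg_id"), rs.getString("payload"));
            }
        }
    }

    private void sendToMq(String msgId, String payload) { /* write to Kafka/RocketMQ */ }
}
```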
Scheme summary
The advantages of the scheme are as follows:
- Message reliability is achieved through application design and development rather than through the message middleware, weakening the dependence on specific MQ middleware features.
- The scheme is lightweight and easy to implement.
The disadvantages are as follows:
- It is bound to specific business scenarios, highly coupled, and not reusable.
- Message data and business data live in the same database, consuming business system resources.
- When the business system uses a relational database, message processing performance is limited by the concurrency performance of that relational database.
4.5 MQ transactions — eventual consistency
Scheme introduction
The MQ-based distributed transaction scheme is in effect an encapsulation of the local message table: the local message table moves inside the MQ system, and the rest of the protocol is basically the same as in the local message table scheme.
Processing flow
The following describes the MQ distributed transaction scheme based on RocketMQ 4.3.
In the local message table scheme, a database transaction guarantees consistency between the active party's business data and the written message record. On top of an ordinary MQ, RocketMQ's transaction message additionally provides a 2PC-style commit interface. The scheme works as follows:
Normal case: the transaction active party's services are normal and no failure occurs. The message sending flow is as follows:
- 1. The sender sends a half message to the MQ Server.
- 2. After persisting the message successfully, the MQ Server acknowledges to the sender that the message was received.
- 3. The sender starts executing its local transaction logic.
- 4. The sender submits a second confirmation (commit or rollback) to the MQ Server according to the result of the local transaction.
- 5. On receiving commit, the MQ Server marks the half message as deliverable, and the subscriber eventually receives it; on receiving rollback, the MQ Server deletes the half message, and the subscriber never receives it.
Failure case: if the network disconnects or the application restarts, the second confirmation in step 4 may time out and never reach the MQ Server. The handling logic is then:
- 5. The MQ Server initiates a check-back (status query) for the message.
- 6. On receiving the check-back, the sender examines the final result of the local transaction corresponding to the message.
- 7. The sender submits the second confirmation again according to the final status of the local transaction.
- 8. The MQ Server delivers or deletes the message according to the commit/rollback it receives.
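A sketch of the flow above using the transaction message API of the RocketMQ 4.3 Java client (`TransactionMQProducer` and `TransactionListener` exist in that client; the producer group, topic, and the two local-transaction helpers here are made-up names for the example):

```java
import org.apache.rocketmq.client.producer.LocalTransactionState;
import org.apache.rocketmq.client.producer.TransactionListener;
import org.apache.rocketmq.client.producer.TransactionMQProducer;
import org.apache.rocketmq.common.message.Message;
import org.apache.rocketmq.common.message.MessageExt;

public class TxMessageDemo {
    public static void main(String[] args) throws Exception {
        TransactionMQProducer producer = new TransactionMQProducer("inventory_tx_group");
        producer.setTransactionListener(new TransactionListener() {
            // Steps 3-4: run the local transaction after the half message is stored,
            // and report commit/rollback as the second confirmation.
            @Override
            public LocalTransactionState executeLocalTransaction(Message msg, Object arg) {
                try {
                    deductInventoryLocally(msg); // hypothetical local DB transaction
                    return LocalTransactionState.COMMIT_MESSAGE;
                } catch (Exception e) {
                    return LocalTransactionState.ROLLBACK_MESSAGE;
                }
            }
            // Steps 5-7: the broker's check-back when the confirmation was lost;
            // answer from the locally persisted transaction outcome.
            @Override
            public LocalTransactionState checkLocalTransaction(MessageExt msg) {
                Boolean done = lookUpLocalTxResult(msg.getTransactionId()); // hypothetical
                if (done == null) return LocalTransactionState.UNKNOW; // check again later
                return done ? LocalTransactionState.COMMIT_MESSAGE
                            : LocalTransactionState.ROLLBACK_MESSAGE;
            }
        });
        producer.start();
        // Step 1: this sends the half message, then drives the listener above.
        producer.sendMessageInTransaction(
            new Message("order-topic", "{\"sku\":\"S1\",\"qty\":2}".getBytes()), null);
        producer.shutdown();
    }

    private static void deductInventoryLocally(Message msg) { /* local transaction */ }
    private static Boolean lookUpLocalTxResult(String txId) { return Boolean.TRUE; }
}
```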
Scheme summary
Compared with the local message table scheme, the MQ transaction scheme has the following advantages:
- Message data is stored independently, reducing coupling between the business system and the message system.
- Throughput is better than that of the local message table scheme.
The disadvantages are:
- One message requires two network round trips (half message + commit/rollback message).
- The business processing service must implement the message status check-back interface.
4.6 Saga transactions — eventual consistency
Scheme introduction
Saga transactions originate from the 1987 paper "Sagas" by Hector Garcia-Molina and Kenneth Salem of Princeton University on how to handle long lived transactions. The core idea is to split a long transaction into multiple local short transactions, coordinated by a Saga transaction coordinator: if every sub-transaction completes normally, the Saga completes normally; if any step fails, the compensating operations are invoked once each, in reverse order.
Processing flow
The basic protocol of a Saga transaction is as follows:
- Each Saga transaction consists of a series of ordered, idempotent sub-transactions Ti.
- Each Ti has a corresponding idempotent compensating action Ci, used to undo the results of Ti.
As you can see, compared with TCC, Saga has no "reserve" step: each Ti simply commits directly to the database.
Taking the ordering process below as an example, where the whole operation includes order creation, inventory deduction, payment, and points increase, a Saga has two possible execution orders:
- Normal completion: T1, T2, T3, ..., Tn. For example: deduct inventory (T1), create order (T2), pay (T3), completing the whole transaction in order.
- Rollback: T1, T2, ..., Tj, Cj, ..., C2, C1, where 0 < j < n. For example: deduct inventory (T1), create order (T2), pay (T3, payment fails), roll back payment (C3), roll back order (C2), restore inventory (C1).
Saga defines two recovery policies:
- Forward recovery
Corresponds to the first execution order above and applies to scenarios that must succeed. Execution looks like: T1, T2, ..., Tj (fails), Tj (retry), ..., Tn, where j is the sub-transaction that errored. No Ci is needed in this case.
- Backward recovery
Corresponds to the second execution order above, where j is the sub-transaction that errored. The effect of this approach is to undo all previously successful sub-transactions, so that the execution of the whole Saga is revoked (a code sketch of this policy follows).
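A minimal sketch of backward recovery under assumed names: sub-transactions run in order, and the compensations of the already-completed steps run in reverse when one fails. For simplicity, the failed step is assumed to leave no effects of its own, whereas the protocol above also runs Cj for the partially executed Tj.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Minimal Saga sketch (illustrative names). Each step pairs Ti with its compensation Ci.
class SagaStep {
    final String name;
    final Runnable action;       // Ti: commits directly, no reserve phase
    final Runnable compensation; // Ci: idempotent undo of Ti
    SagaStep(String name, Runnable action, Runnable compensation) {
        this.name = name; this.action = action; this.compensation = compensation;
    }
}

class SagaExecutor {
    // Backward recovery: run T1..Tn; on failure, run the completed steps' Ci in reverse.
    boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action.run();
                completed.push(step); // remember for possible compensation
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) completed.pop().compensation.run();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        new SagaExecutor().run(List.of(
            new SagaStep("deduct inventory", () -> {},
                         () -> System.out.println("restore inventory (C1)")),
            new SagaStep("create order", () -> {},
                         () -> System.out.println("cancel order (C2)")),
            new SagaStep("pay", () -> { throw new RuntimeException("payment failed"); },
                         () -> {})
        ));
        // prints: cancel order (C2), restore inventory (C1) -- reverse order
    }
}
```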
Saga transactions are commonly implemented in two different ways:
- 1. Order Orchestrator: a central orchestrator is responsible for centralized event decision-making and sequencing of the business logic.
The central orchestrator (OSO) communicates with each service in command/response style and is solely responsible for telling each participant what to do and when to do it.
Take the example of an e-commerce order:
1. The main business logic of the transaction initiator asks the OSO service to start the order transaction.
2. OSO asks the inventory service to deduct inventory, and the inventory service replies with the deduction result.
3. OSO asks the order service to create an order, and the order service replies with the creation result.
4. OSO asks the payment service to make the payment, and the payment service replies with the processing result.
5. The main business logic receives and handles OSO's response with the transaction result.
The central coordinator must know in advance the flow of the whole order transaction (for example, by reading configuration). When anything fails, it is also responsible for coordinating a distributed rollback by sending commands to each participant to undo their previous actions. Rollback is much easier when everything is coordinated by a central coordinator, because the coordinator runs the forward flow by default and runs the reverse flow only when rolling back.
- 2. Event Choreography: there is no central coordinator (and thus no single-point risk); each service produces and watches the other services' events and decides whether it should take action.
In the event choreography approach, the first service executes a transaction and then publishes an event. One or more services listen for that event, execute their own local transactions, and publish (or do not publish) new events.
The distributed transaction ends when the last service executes its local transaction without publishing any event, or when an event it publishes is heard by no Saga participant.
Take the example of an e-commerce order:
1. The transaction initiator's main business logic publishes an "order started" event.
2. The inventory service listens for the "order started" event, deducts the inventory, and publishes an "inventory deducted" event.
3. The order service listens for the "inventory deducted" event, creates the order, and publishes an "order created" event.
4. The payment service listens for the "order created" event, makes the payment, and publishes an "order paid" event.
5. The main business logic listens for the "order paid" event and processes it.
Event choreography is a natural way to implement the Saga pattern: it is simple, easy to understand, and does not require much code. It can be appropriate when the transaction involves two to four steps.
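A compact sketch of the choreography wiring under assumed names: an in-memory event bus stands in for the message broker, and each service reacts to the previous service's event with no central coordinator involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// In-memory stand-in for a message broker, just to show the choreography wiring.
class EventBus {
    private final Map<String, List<Consumer<String>>> listeners = new HashMap<>();
    void subscribe(String event, Consumer<String> handler) {
        listeners.computeIfAbsent(event, k -> new ArrayList<>()).add(handler);
    }
    void publish(String event, String payload) {
        listeners.getOrDefault(event, List.of()).forEach(h -> h.accept(payload));
    }
}

public class ChoreographyDemo {
    public static void main(String[] args) {
        EventBus bus = new EventBus();
        // Each service listens to the previous event, runs its local transaction,
        // and publishes the next event.
        bus.subscribe("OrderStarted",      o -> { /* deduct inventory */ bus.publish("InventoryDeducted", o); });
        bus.subscribe("InventoryDeducted", o -> { /* create order */     bus.publish("OrderCreated", o); });
        bus.subscribe("OrderCreated",      o -> { /* pay */              bus.publish("OrderPaid", o); });
        bus.subscribe("OrderPaid",         o -> System.out.println("order " + o + " completed"));
        bus.publish("OrderStarted", "1001"); // the initiator kicks off the saga
    }
}
```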
Scheme summary
Advantages and disadvantages of the command/orchestration design. The advantages are as follows:
- 1. The relationships between services are simple and circular dependencies are avoided, because the Saga coordinator calls the Saga participants but the participants do not call the coordinator.
- 2. Development is simple: participants only need to implement command/reply (the reply message is really just another kind of event message), which reduces their complexity.
- 3. It is easy to maintain and extend: as steps are added, transaction complexity grows only linearly, rollback is easier to manage, and the whole thing is easier to implement and test.
The disadvantages are as follows:
- 1. The central coordinator's processing logic easily becomes too complex, making it hard to maintain.
- 2. The coordinator carries a single-point-of-failure risk.
Advantages and disadvantages of the event/choreography design. The advantages are as follows:
- 1. It avoids the single-point-of-failure risk of a central coordinator.
- 2. When few steps are involved, the services are simple to develop and easy to implement.
The disadvantages are as follows:
- 1. There is a risk of circular dependencies between services.
- 2. When many steps are involved, the relationships between services become chaotic and hard to trace and debug.
One more point: because the Saga model has no prepare stage, it cannot guarantee isolation between transactions. When multiple Saga transactions operate on the same resource, problems such as lost updates and dirty reads can arise, so concurrency must be controlled in the business layer at that point, for example by locking at the application level or by freezing resources in advance at the application level.
5 Conclusion
Application scenarios of each solution
Having introduced the theory of distributed transactions and the common solutions, the final goal is to apply them in real projects, so here is a summary of the common usage scenarios of each solution.
- 2PC/3PC relies on the database, provides strong consistency and strong transaction guarantees, but has relatively high latency. It suits traditional monolithic applications where a single method performs cross-database operations, and it is not suitable for scenarios with high concurrency and high performance requirements.
- TCC suits flows that are fixed and short, with high requirements on real-time behavior and data consistency, for example the three core services of an Internet finance company: trading, payment, and accounting.
- Local message table / MQ transactions suit transactions whose participants support idempotent operations and whose consistency requirements are relatively low: the business can tolerate data inconsistency up to one manual-inspection cycle, the transaction involves few participants and steps, and a reconciliation/verification system exists on the business side.
- Saga transactions cannot guarantee isolation, so concurrency must be controlled in the business layer; they suit business scenarios where few concurrent transactions operate on the same resources. Compared with TCC, Saga has no pre-commit step, which can make compensating actions awkward to implement: for instance, if a step sends an SMS, the compensation has to send another SMS explaining the cancellation, making for a poor user experience. Saga suits scenarios where compensating actions are easy to handle.
Distributed transaction scheme design
This article focuses on principles; the industry already offers many open-source and paid solutions, which are not discussed here for reasons of space.
When applying theory to real architecture design, many people fall into the trap of "when all you have is a hammer, everything looks like a nail": the design considers too many problem scenarios and introduces all kinds of retries and compensation mechanisms, making the system overly complex and hard to deliver.
The easiest way in the world to solve a computer problem: simply don't solve it! — Shen Xun, Alibaba middleware technology expert
Some problems that seem important can actually be avoided through proper design or by decomposing the problem. It is not necessary for a distributed transaction design to consider every exception and over-design all kinds of rollback and compensation mechanisms; spending time beyond the problem itself is not only inefficient but wasteful.
If a system implements a full rollback process, it becomes much more complex and prone to bugs, and the probability of a bug in the rollback machinery may well be higher than the probability that the rollback is ever needed. When designing a system, weigh whether it is worth so much cost to solve a problem with a very small probability, and consider whether such rare cases could instead be handled manually. This is something worth thinking hard about when solving difficult problems.
For more articles, follow the author's public account [Distributed System Architecture].
References
Technology Talk: Transactions
Implementation of transactions in MySQL
Distributed consistency algorithms: 2PC and 3PC
Principles and Practice of the Distributed Open Messaging System (RocketMQ)
Introduction to RocketMQ transaction messages
Saga Distributed Transaction Solution and Practice — Jiang Ning
The Saga pattern for distributed transactions
Thinking about distributed transactions through a gold-coin recharge