Case list source address github.com/qinxuewu/bo…
Distributed transaction
- Distributed transaction refers to transaction participants, transaction supporting servers and resource servers located on different nodes of distributed system. Usually, a distributed transaction involves operations on multiple data sources or business systems.
- A typical distributed transaction scenario: a cross-bank transfer operation involves invoking two remote banking services
Theory of CAP
- CAP theory: it is impossible for a distributed system to meet the three basic requirements of consistency, availability and fault tolerance of partitions, but only two of them at most
- Consistency (C) : A feature of whether data can be consistent across multiple copies.
- Availability (A) : It means that the services provided by the system must be consistently available. For each user’s request, results are always returned within A limited time. After the time, the system is considered to be unavailable
- Fault tolerance of partitions (P) : When a distributed system encounters any network partition failure, it still needs to be able to provide services that meet the requirements of consistency and availability, unless the entire network environment fails.
Application of CAP theorem
- Give up P (CA) : if you want to avoid fault tolerance system partition problem, it is a relatively simple to all of the data (or data) associated with things in a distributed nodes, so although there is no guarantee that 100% system will not go wrong, but at least not met due to the negative effects of network partition
- Abandon A(CP): Once the system encounters network partition or other faults, the affected services need to wait for A certain period of time. During the application waiting period, the system cannot provide normal services externally, that is, it is unavailable
- Abandon C(AP): The abandonment of consistency here does not mean that data consistency is completely unnecessary. It means to abandon the strong consistency of data and retain the final consistency of data.
The BASE theory of
- BASE is basically available, soft state, final consistency. It is the result of the authority of consistency and availability in CAP. It is evolved based on the CAP theorem. The core idea is that even if strong consistency cannot be achieved, each application can adopt appropriate ways to achieve final consistency according to its own business specific
2 PC to submit
- The two-phase commit protocol divides the process of transaction submission into two stages: submission of transaction request and execution of transaction submission.
Phase 1: Submit the transaction request
- Transaction query: The coordinator sends transaction content to all participants, asks if a transaction commit can be performed, and waits for responses from each participant
- Execute transaction: Each participant node performs a transaction and writes Undo and Redo information to the transaction log
- If the participant successfully executes the transaction, it will feedback to the coordinator Yes response, indicating that the transaction can be executed; if the transaction is not successfully executed, it will feedback to the coordinator No response, indicating that the transaction can not be executed
- The two-phase commit phase overnight is called the vote phase, where each participant votes on whether to proceed with the subsequent transaction commit operation
Phase two: Perform the transaction commit
- If the coordinator receives a Yes response from all participants, the transaction commit is performed.
- Send a Commit request: The coordinator issues a Commit request to all the participant nodes
- Transaction Commit: After receiving the Commit request, the participant formally performs the transaction Commit and relinquishes the transaction resources occupied during the entire execution of the transaction
- Feedback transaction commit results: The participant sends an ACK message to the coordinator after completing the transaction commit
- Completion of transaction: The coordinator completes the transaction after receiving ACK messages from all participants
Interrupt the transaction
- If any participant gives a No response to the coordinator, or if the coordinator is unable to receive feedback from all participants after waiting for the supermarket, the transaction is interrupted.
- Send a Rollback request: The coordinator issues a Rollback request to all participant nodes
- Transaction Rollback: After receiving the Rollback request, the participant performs the transaction Rollback with the Undo information recorded in one of the phases and releases the resources occupied during the transaction execution after the Rollback is complete.
- Feedback transaction rollback results: Participants send AN ACK message to the coordinator after completing the transaction rollback
- Interrupt transaction: After receiving ACk messages from all participants, the coordinator completes the transaction interrupt,
The advantages and disadvantages
- The principle is simple and the implementation is convenient
- Disadvantages are synchronous blocking, single point of problem, split brain, conservative
3 PCS to submit
- Three-phase commit, also known as the three-phase Commit protocol, is an improved version of two-phase Commit (2PC).
- Unlike the two-phase commit, the three-phase commit has two change points. Introduce timeouts. Introduce timeouts for both the coordinator and the participant. Insert a preparation phase between phases 1 and 2. The state of each participating node is consistent before the final submission stage.
- There are three phases of CanCommit, PreCommit, and DoCommit.
Seata distributed transaction scheme
- Seata is an open source distributed transaction solution dedicated to providing high performance and easy to use distributed transaction services. Seata will provide users with AT, TCC, SAGA and XA transaction modes to create a one-stop distributed solution for users.
Seata term
- TC: Transaction coordinator. Maintains the state of global and branch transactions and drives global transaction commit or rollback.
- TM: Transaction manager. Define the scope of a global transaction: start, commit, or roll back the global transaction
- RM: Manages resources for branch transaction processing, talks with TCS to register branch transactions and report status of branch transactions, and drives commit or rollback of branch transactions.
Seata’s 2PC scheme
- Phase one: Business data and rollback log records are committed in the same local transaction, freeing local locks and connection resources.
- Phase two: Commit asynchronously, done very quickly. Rollback is compensated in reverse by the rollback log of one phase.
- Make sure you get the global lock before committing a phase 1 local transaction. Unable to commit local transaction because global lock could not be obtained.
- Attempts to acquire the global lock are limited to a certain range, beyond which the local transaction is abandoned and rolled back to release the local lock.
- The default global isolation level for Seata (AT mode) is read uncommitted, based on database local transaction isolation level reads committed or above
- If the application is applied in certain scenarios, it must require that the global read has been committed. Currently, Seata is brokered through SELECT FOR UPDATE statements.
Seata performs process analysis
- each
RM
useDataSourceProxy
Link data paths for useConnectionProxy
The purpose of using data sources and data proxies in the first phase will beundo_log
And business data in a local transaction commit, so that the undo_log is saved whenever there is a business operation - Undo_log (undo_log) is the first phase of undo_log to store the value before and after the data modification, so as to make sure that the transaction rollback is correct, so the branch transaction has been committed after the first phase is completed, which also releases the lock resource
- TM turns on the global transaction start, puts the XID global transaction ID in the transaction context, and also passes the XID to the downstream Branch transaction through the Feign call. Each Branch transaction associates its Branch transaction ID with the XID
- The second stage is global transaction submission. TC will notify each branch participant to submit the branch transaction. In the first stage, the branch transaction has already been committed
- If a branch transaction is abnormal, the second phase will roll back the global transaction, TC will inform all branch participants to roll back the branch transaction, pass
XID
andBranch-ID
Find the corresponding rollback log and roll back the reverse generated by the logSQL
And execute to complete the branch transaction rollback to the previous
Seata in action
- Github.com/seata/seata…
- Github.com/seata/seata…
TCC distributed transactions
- TCC is a two-stage programming model of servitization, and its three methods, Try, Confirm and Cancel, are implemented by business coding
- TCC requires each branch transaction to perform three operations: preprocess a Try, Confirm, and Cancel.
- The Try operation checks services and reserves resources.
- Confirm Perform service confirmation operations
- Cancel implements the opposite of Try, the rollback operation.
- TM first initiates all the branch transaction Try operations. If the Try operation of any branch transaction fails, TM will initiate all the branch transaction Cancel operations. If all the Try operations succeed, TM will initiate Confirm operations for all branch transactions. If Confirm/Cancel fails,TM will retry.
The three stages of TCC
- The Try stage does the business check (consistency) and resource reservation (isolation). This stage is only a preliminary operation, which together with the Confirmy later makes a complete business logic
- In the Confirm phase, Confirm is committed. In the Try phase, Confirm is executed after all branch transactions are successfully executed. In general, TCC is used to ensure that the Confirm phase does not go wrong, that is: If the Try succeeds, Confirm succeeds. If an error occurs during the Confirm phase, you need to introduce the retry mechanism or manually handle the error
- In the Cancel stage, branch transactions are cancelled and reserved resources are released when a service execution error needs to be rolled back to the state. Under normal circumstances, TCC is used to consider that the Cancel stage must also be true work. If there is an error in the Cance stage, retry mechanism or manual processing should be introduced
- TM Transaction Manager: THE TM transaction manager can be implemented as a standalone service or can be enabled
Global transaction initiator
Acting as a TM,TM is separate for common components, to consider the system architecture and software reuse - TM generates a global transaction record when it initiates a global transaction. The global transaction ID runs through the whole distributed transaction invocation chain, which is used to record the transaction context, track and record the state. It is used for Confirm and CACEL failure, which requires retries, so idempotency needs to be realized
Three exception handling cases for TCC
Idempotent processing
- Because of network jitter and other reasons, the distributed transaction framework may repeatedly call the two-phase interface of a branch transaction in the same distributed transaction. So the two-phase interface Confirm/Cancel of a branch transaction needs to be idempotent. If a two-phase interface is not idempotent, serious problems may occur, resulting in repeated use or release of resources, which may lead to service failures.
- For idempotent problems, the usual method is to introduce idempotent fields to prevent replay attacks. For the idempotent problem in distributed transaction framework, the same tool can be used.
- The insertion time for idempotent records is the participant’s Try method, at which point the branch transaction state is initialized as INIT. It then sets the status to CONFIRMED/ROLLBACKED when the two-phase Confirm/Cancel is executed.
- When the TC repeatedly invokes the two-phase interface, the participant will first obtain the corresponding record of the transaction status control table to check its transaction status. If the status is CONFIRMED/ROLLBACKED, it indicates that the participant has completed his/her work and does not need to perform it again. The participant can directly return the idempotent success result to the TC to advance the distributed transaction.
Empty the rollback
- The two-phase Cancel method is called when the participant’s Try method is not called, and the Cancel method needs to have some way of identifying whether the Try is executing or not. If the Try has not been executed, the Cancel operation is invalid and the Cancel is an empty rollback. If the Try is executed, then normal rollback logic is performed.
- To deal with the empty rollback problem, participants need to have a way in the two-stage Cancel method to identify whether the first-stage Try has been executed. Obviously, you can continue to do this with transaction state control tables.
- When the Try method is successfully executed, a record is inserted indicating that the branch transaction is INIT. Therefore, when the two-stage Cancel method is called later, it can be judged by querying the corresponding record of the control table. If the record exists and its status is INIT, phase 1 has been successfully executed. You can roll back the record to release reserved resources. If no record exists, the phase is not executed, and no resources are released.
Resources suspension
- Problem: TC rollback transaction invocation completes empty rollback in phase 2, phase 1 execution succeeds
- Solution: Transaction state control records as a control means, insert records when no records are found in the second stage, and check whether records exist during the execution of the first stage
TCC compared to 2PC
- 2PC is usually handled at the DB level of cross-library, while TCC is handled at the application level and needs to be realized by business logic. The advantage of this distributed transaction implementation mode is that applications can define the granularity of data operations by themselves, which makes it possible to reduce lock conflicts and improve throughput
- However, the disadvantage is that it is very intrusive to the application. Each branch of the business logic needs to implement the Try, confirm and cancel operations. In addition, it is difficult to implement. Different rollback strategies need to be implemented according to network status and different failure causes of system faults
The Hmily framework implements TCC columns
A # accountIdempotent effect of a try The suspension of a try checks whether the balance is $30 minus $30 Confirm is null Cancel Idempotent effect CACEL empty rollback process increases available balance by $30, rollback operationB # accountIdempotent effect add 30 yuan to confirm add 30 yuan to idempotent effect cancel cancelCopy the code
RocketMQ implements the ultimate consistency of reliable messages
- Reliable message ultimate consistency is the consistency that ensures the delivery of messages from producers through messaging middleware to consumers
- RocketMQ addresses two main features: local transactions and atomicity of message sending. Reliability of messages received by transaction participants
- The final consistency transaction of reliable message is suitable for the scenario with long execution period and low real-time requirement. After the introduction of message mechanism, the synchronous transaction operation is changed into asynchronous operation based on message execution, avoiding the influence of synchronous blocking operation in distributed transaction, and realizing the decoupling of the two services
Best effort notice
What is the difference between best effort notification and reliable message consistency
- Reliable message consistency: the notifier must ensure that the message is sent and sent to the notifier, and the reliability of the message is guaranteed by the notifier
- Notification with maximum effort. The sender tries its best to notify the recipient of the service processing result. However, the recipient may fail to receive the message
The application scenarios of both
- Reliable message consistency is concerned with transaction consistency in the transaction process, completing the transaction in an asynchronous manner
- Best effort notification is concerned with post-transaction notification transactions, i.e., notification of reliable transaction results
The MQ-based ACK mechanism implements maximum effort notification
- MQ uses its ACK mechanism to send message notifications from MQ to the notifier, and the originator sends ordinary messages to MQ
- Receive notifications to listen to MQ, receive messages, and respond to ACK upon completion of business processing
- If the receiving party does not respond to an ACK, MQ repeats the notification and increments the notification interval
- This scheme is suitable for notification between internal microservices, not for notification with external platforms
Plan 2: add a notification service area for notification, applicable when providing external third parties
Comparative analysis of distributed transaction schemes
2PC
The biggest one is a blocking protocol. The RM waits for TM’s decision after executing the branch transaction, and the service blocks to lock the resource. Due to its high blocking mechanism and worst-time complexity, this design cannot adapt to the need to expand as the number of services involved in a transaction increases, and is difficult to be used in distributed services with high concurrency and long sub-transaction life cycle- If you take
TCC
The transaction processing process is compared with 2PC two-stage commit. 2PC is usually handled at the DB level across libraries, while TCC is handled at the application level and needs to be realized by business logic. The advantage of this distributed transaction is that it allows you toApplying the granularity of custom data operations makes it possible to reduce lock conflicts and increase throughput
. The disadvantage is that it is very intrusive to the application, and each branch of the business logic needs to implement three operations. In addition, it is difficult to implement, and different strategies need to be implemented according to different failure reasons such as network status and system failure. Reliable messages are ultimately consistent
Transactions are suitable for scenarios with long execution cycles and low real-time requirements. After the message mechanism is introduced, the synchronous transaction operation is changed into asynchronous operation based on message execution, avoiding the influence of synchronous blocking operation in distributed transaction, and realizing the decoupling of the two services. Typical scenarios are registration and sending points, login and sending coupons, etcBest effort notice
Is a kind of distributed transaction requires minimum, suitable for some eventual consistency time sensitivity low business, allowing a notify party business processing failure, upon receiving notification receipt notification, the failure treatment actively, whatever initiated notifying party how to deal with the results will not affect the notified party subsequent processing, by notifying party shall provide the query execution interface, It is used for the receiver to proofread the result, typical application scenarios: bank notice, payment result notice, etc.
2PC | TCC | Reliable sources | Best effort notice | |
---|---|---|---|---|
consistency | Strong a laymen | The final agreement | The final agreement | The final agreement |
throughput | low | In the | high | high |
Implementation complexity | easy | difficult | In the | easy |
Case column Demo source address
- Github.com/qinxuewu/bo…