Distributed transaction implementation
- Common implementations include the two-phase commit protocol (2PC), the three-phase commit protocol (3PC), and the Paxos algorithm.
- The X/Open DTP model (1994) defines four components: the application program (AP), the transaction manager (TM), the resource manager (RM), and the communication resource manager (CRM). In practice, the TM is typically transaction middleware, the RM is typically a database, and the CRM is typically messaging middleware. The two-phase and three-phase commit protocols derive from this model.
- Paxos algorithm.
- Saga.
Two-phase commit (2PC)
- 2PC has two phases: the prepare phase (voting phase) and the commit phase (execution phase).
- In the prepare phase, the coordinator sends the transaction to every participant, asks each whether it can commit, and waits for the responses.
- In the commit phase, if every participant voted to commit, the coordinator sends a commit request to all of them; if any participant reported failure, the transaction is aborted, the coordinator sends a rollback request to all participants, and each participant rolls back.
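The two phases above can be sketched as a single-process simulation (a minimal illustration; the `Participant` class and its `will_succeed` flag are hypothetical stand-ins for real resource managers, and real 2PC involves network messages, timeouts, and durable logs):

```python
from enum import Enum

class Vote(Enum):
    YES = "yes"
    NO = "no"

class Participant:
    """A resource manager that can prepare, commit, or roll back."""
    def __init__(self, name, will_succeed=True):
        self.name = name
        self.will_succeed = will_succeed
        self.state = "initial"

    def prepare(self):
        # Phase 1: lock resources, write undo/redo logs, then vote.
        if self.will_succeed:
            self.state = "prepared"
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"

def two_phase_commit(participants):
    # Phase 1 (voting): the coordinator asks every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit): commit only if every vote is YES;
    # otherwise roll back everyone who had prepared.
    if all(v is Vote.YES for v in votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        if p.state == "prepared":
            p.rollback()
    return "rolled_back"
```

Note that between `prepare()` and the phase-2 decision every participant sits blocked with its resources locked, which is exactly where the problems below arise.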
- Existing problems:
- Synchronous blocking. While the protocol runs, every participating node blocks on the transaction: when one participant holds a shared resource, any other node that needs that resource must wait.
- Single point of failure. The coordinator is critical: if it crashes, the participants block indefinitely. This is worst in phase 2, where a coordinator failure leaves every participant holding its transaction locks, unable to finish. (A new coordinator can be elected, but that alone does not unblock participants stuck because the old coordinator went down.)
- Data inconsistency. In the commit phase, if the network fails locally or the coordinator crashes while sending commit requests, only a subset of participants may receive the request. Those participants commit; the rest cannot, and the distributed system ends up with inconsistent data.
- A failure 2PC cannot resolve: the coordinator crashes after sending a commit message, and the only participant that received it crashes at the same time. Even if the remaining nodes elect a new coordinator, the status of the transaction is uncertain: no one can tell whether it was committed.
Three-phase commit (3PC)
- 3PC splits the prepare phase of 2PC in two, giving three phases: CanCommit, PreCommit, and DoCommit.
- CanCommit: the coordinator sends a request describing the transaction, asks each participant whether it can commit, and waits for the responses.
- PreCommit: if any participant returned an error or timed out, the transaction is aborted in phase 3; if all responded normally, phase 3 performs the commit.
- DoCommit: the coordinator sends either a commit request or, if an abort was decided, an abort request telling all participants to roll back.
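The three phases, including the timeout rule that lets participants stop blocking, can be sketched as follows (a minimal illustration; the `Participant3PC` class and the crash flag are hypothetical, and a real 3PC runs over a network with per-participant timers):

```python
class Participant3PC:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit_ok = can_commit
        self.state = "initial"

    def can_commit(self):
        # Phase 1: a pure question; no locks are taken yet.
        return self.can_commit_ok

    def pre_commit(self):
        # Phase 2: lock resources and acknowledge readiness.
        self.state = "pre_committed"

    def do_commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

    def on_coordinator_timeout(self):
        # 3PC's key rule: a participant that reached pre-committed and then
        # stops hearing from the coordinator commits by default instead of
        # blocking forever on its locked resources.
        if self.state == "pre_committed":
            self.do_commit()
        else:
            self.abort()

def three_phase_commit(participants, coordinator_crashes_after_precommit=False):
    # Phase 1: CanCommit.
    if not all(p.can_commit() for p in participants):
        for p in participants:
            p.abort()
        return "aborted"
    # Phase 2: PreCommit.
    for p in participants:
        p.pre_commit()
    if coordinator_crashes_after_precommit:
        # Participants time out waiting for DoCommit and commit by default.
        for p in participants:
            p.on_coordinator_timeout()
        return "committed_by_timeout"
    # Phase 3: DoCommit.
    for p in participants:
        p.do_commit()
    return "committed"
```

The default-commit-on-timeout rule is what reduces blocking, and it is also the source of the inconsistency described below: a participant that misses an abort message will commit anyway.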
- Existing problems:
- Compared with 2PC, 3PC mitigates the single point of failure and reduces blocking: a participant that fails to hear from the coordinator in time commits by default instead of holding its transaction resources indefinitely. But this mechanism can itself cause inconsistency: if, for network reasons, a participant never receives the coordinator's abort, it times out and commits, ending up inconsistent with the participants that received the abort and rolled back.
- Note: having seen both, neither 2PC nor 3PC completely solves the distributed consistency problem.
The Paxos algorithm (many articles online cover it in depth; only a brief introduction here)
- Paxos is a consensus algorithm: it gets a set of nodes to agree on a single value. It involves two roles, proposers and acceptors.
- For example, if a cluster has multiple nodes, how does Paxos get them to agree on their data? It takes two phases:
- The first phase is an election; no value has been chosen yet. All the nodes (the acceptors) elect, from among the proposers, a leader who will carry the proposal forward. Proposal numbers decide who wins: the higher the number, the stronger the claim to be elected.
- The second phase builds on the result of the first: the acceptors explicitly accept the proposal of the elected proposer, fixing the proposal's content.
- The proposal number matters throughout: at any stage, the proposer with the smaller number is rejected. Moreover, once an acceptor has accepted a previously elected proposer's value in the first phase, any later proposer that contacts that acceptor must adopt that earlier value as its own, even if it wins the election; it then puts forward the earlier leader's value in the second phase, so that everyone's opinion converges. Only if the acceptors have accepted no value before may a newly elected proposer propose its own value.
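The two phases and the adopt-the-accepted-value rule can be sketched as single-decree Paxos (a simplified, single-process illustration; the class and function names are hypothetical, and real Paxos exchanges messages between separate machines):

```python
class Acceptor:
    def __init__(self):
        self.promised_n = -1       # highest proposal number promised
        self.accepted_n = -1       # number of the proposal accepted, if any
        self.accepted_value = None

    def prepare(self, n):
        # Phase 1b: promise to ignore proposals numbered below n,
        # and report any value already accepted.
        if n > self.promised_n:
            self.promised_n = n
            return True, self.accepted_n, self.accepted_value
        return False, None, None

    def accept(self, n, value):
        # Phase 2b: accept unless a higher-numbered prepare was seen.
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False

def propose(acceptors, n, value):
    # Phase 1a: send prepare(n) to all acceptors ("the election").
    promises = [a.prepare(n) for a in acceptors]
    granted = [(an, av) for ok, an, av in promises if ok]
    if len(granted) <= len(acceptors) // 2:
        return None  # no majority; retry with a higher n
    # If any acceptor already accepted a value, the proposer must adopt
    # the one with the highest proposal number instead of its own.
    prior = [(an, av) for an, av in granted if av is not None]
    if prior:
        value = max(prior)[1]
    # Phase 2a: send accept(n, value); the value is chosen
    # once a majority of acceptors accepts it.
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks > len(acceptors) // 2 else None
```

Once a majority has accepted "A", a later proposer that tries to propose "B" with a higher number is forced to re-propose "A", which is how opinions converge.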
Reference: the first article explains the principle very thoroughly; the second (a Zhihu answer) has a very easy-to-understand example
- Distributed system Paxos algorithm
- How to explain the Paxos algorithm in a straightforward way?
Saga (Unfinished)
Requirements and Limitations
- In a distributed system, the services a Saga invokes must be idempotent, because network requests can be delayed and retried: invoking the same saga step multiple times must produce the same result.
- Saga guarantees only ACD; it does not guarantee data isolation, so the following anomalies can occur:
- Lost updates: two sagas operating on the same resource at the same time can overwrite each other's writes, leaving the data semantically inconsistent.
- Dirty reads: while one saga is in the middle of modifying data, another saga reads it and sees a value that may later be rolled back.
- Fuzzy (non-repeatable) reads: a saga modifies the data between two reads by another saga, so the two reads disagree.
- Mitigations: add a session and lock mechanism at the business level to serialize access to the resource, or isolate the resource within a single service and re-read the current state during each service operation so updates are based on the latest value.
- When one transaction among multiple Saga transactions fails, how do we ensure an effective rollback?
- Forward recovery: monitor the Saga transactions so an administrator can check whether each succeeded, and retry failed steps automatically or manually.
- Backward recovery (compensation): give each microservice a compensation mechanism for failure by adding compensating events.
- Choreography of Saga transactions: to be added
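The backward-recovery idea above can be sketched as an orchestrated saga that runs compensations in reverse order when a step fails (a minimal illustration; `SagaStep` and the step names are hypothetical, and a real orchestrator persists progress so recovery survives crashes):

```python
class SagaStep:
    def __init__(self, name, action, compensation):
        self.name = name
        self.action = action              # forward operation
        self.compensation = compensation  # compensating (undo) operation

def run_saga(steps):
    """Execute steps in order; on failure, run the compensations of the
    already-completed steps in reverse order (backward recovery)."""
    completed = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            for done in reversed(completed):
                done.compensation()
            return "compensated"
    return "completed"
```

For example, if a "reserve stock" step succeeds and a later "charge payment" step fails, the orchestrator invokes the reservation's compensation to release the stock.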
Choerodon’s Saga solution (with Kafka)
- Choerodon assigns an orchestrator to Saga as the transaction manager. When a service starts, all SagaTasks in that service are registered with the manager. Once a Saga instance starts, service consumers poll to pull the Saga data for that instance and execute the corresponding business logic. The execution status can be viewed through the transaction manager and displayed in the UI.
- Idempotency is achieved by recording execution instances: each SagaTask execution is logged against the instance, and a task that has already been recorded is not executed again.
- Data isolation is simplified to transaction management within a single service. For example, when microservice A and microservice B both need to change data in service C at the same moment, two pieces of Saga data are generated; C pulls both and applies them in its own database, effectively delegating data isolation to the database.
- Choerodon currently implements a forward-recovery strategy. If better strategies emerge, they will be shared in time.
- Orchestration of transactions: To be added
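The execution-record idempotency described above can be sketched as follows (a minimal in-memory illustration; the function and the `executed` set are hypothetical, and a real implementation persists the record durably, together with the task's result):

```python
executed = set()  # persisted record of (saga_instance_id, task_name) pairs

def run_task_once(instance_id, task_name, task):
    """Execute a SagaTask at most once per saga instance by recording each
    execution, so a redelivered message becomes a no-op."""
    key = (instance_id, task_name)
    if key in executed:
        return "skipped"  # already executed: redelivery is idempotent
    result = task()
    executed.add(key)     # record the execution before acknowledging
    return result
```

The same task name under a different saga instance still runs, since the record is keyed by (instance, task).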
Reference:
Microservice data consistency solutions in the Choerodon platform