The transaction

Start with an everyday example. When you go to a restaurant for dinner, you have to pay for the meal. If you do not pay, the owner will not accept it. If you pay but the restaurant never serves your food, you will not accept it. The exchange only completes when both sides succeed. This leads to the definition of a transaction:

A transaction can be thought of as a single unit of work made up of several operations that either all succeed or all fail.

Local transactions

A local transaction is a database transaction inside a traditional monolithic (single-instance) service. Let’s review the four properties of a database transaction, known as ACID:

  • A (Atomicity): All operations that make up a transaction are either all completed or none are executed. There is no partial success or partial failure.
  • C (Consistency): The consistency constraints of the database hold both before and after the transaction executes.
  • I (Isolation): Transactions in a database generally run concurrently. Isolation means that concurrent transactions do not interfere with each other, and one transaction cannot see the intermediate state of another. The transaction isolation level can be configured to prevent anomalies such as dirty reads and non-repeatable reads.
  • D (Durability): Once a transaction completes, its changes are persisted to the database and cannot be rolled back; they survive later failures.

For example, we often put the @Transactional annotation on business-layer methods to open a transaction.
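The all-or-nothing behavior of a local transaction can be sketched with Python's built-in sqlite3 module. This is only an illustration, not the @Transactional mechanism itself: the `orders` and `stock` tables and the `place_order` function are hypothetical names chosen for the example.

```python
import sqlite3

def setup():
    """Create an in-memory database with hypothetical orders and stock tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER)"
    )
    conn.execute("CREATE TABLE stock (product_id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.execute("INSERT INTO stock VALUES (1, 5)")
    conn.commit()
    return conn

def place_order(conn, product_id, quantity):
    """Create an order and deduct stock as one local transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO orders (product_id, quantity) VALUES (?, ?)",
                (product_id, quantity),
            )
            updated = conn.execute(
                "UPDATE stock SET amount = amount - ? "
                "WHERE product_id = ? AND amount >= ?",
                (quantity, product_id, quantity),
            ).rowcount
            if updated == 0:
                # Raising here undoes the INSERT above as well.
                raise ValueError("insufficient stock")
        return True
    except ValueError:
        return False
```

With 5 units in stock, a first order for 3 units succeeds. A second order for 3 units fails, and its already executed INSERT is rolled back together with the failed stock update, so no half-finished order remains.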

As we moved from monolithic services to distributed services, we found that local transactions are no longer enough. When two or more services cooperate through remote calls over the network, service 1 may succeed while service 2 fails. Each service can still roll back its own local transaction, but the data across the whole call chain is left inconsistent. For example, if service 1 is the order service and service 2 is the product service, the order may be created successfully while the product service fails to deduct the stock. This leads to data errors, and it is the situation that distributed transactions are meant to handle.
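The failure mode described above can be reproduced in a small sketch. Two separate in-memory SQLite databases stand in for the order service's and the product service's own stores, and a plain function call stands in for the remote call. All table and function names are hypothetical.

```python
import sqlite3

def make_db(*statements):
    """Create an in-memory database that stands in for one service's own store."""
    conn = sqlite3.connect(":memory:")
    for sql in statements:
        conn.execute(sql)
    conn.commit()
    return conn

def deduct_stock(product_db, product_id, quantity):
    """Product service: a local transaction on the product database only."""
    with product_db:
        cur = product_db.execute(
            "UPDATE stock SET amount = amount - ? "
            "WHERE product_id = ? AND amount >= ?",
            (quantity, product_id, quantity),
        )
        if cur.rowcount == 0:
            raise RuntimeError("product service failed: insufficient stock")

def create_order(order_db, product_db, product_id, quantity):
    """Order service: commits its local transaction, then makes the 'remote' call."""
    with order_db:
        order_db.execute(
            "INSERT INTO orders (product_id, quantity) VALUES (?, ?)",
            (product_id, quantity),
        )
    # The order is already committed at this point.
    # A failure in the call below cannot roll it back.
    deduct_stock(product_db, product_id, quantity)
```

If the stock is 2 and an order for 5 is created, `create_order` raises, the product database is unchanged, but the order row stays committed. Each local transaction behaved correctly, yet the two services now disagree.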

Distributed transactions

A distributed system splits one application into multiple services that can be developed independently, so services must collaborate remotely to complete a transaction. A transaction that different services complete by cooperating over the network is called a distributed transaction. Creating an order while reducing product inventory is one example, and a bank transfer is another.

Distributed transactions are usually discussed in terms of CAP, an acronym for Consistency, Availability, and Partition tolerance.

  • C (Consistency): A read that follows a write returns the latest data. If the data is distributed across multiple nodes, a read from any node returns the latest data.

Characteristics of consistency in a distributed system: 1. Because data must be synchronized between nodes, the response to a write is delayed. 2. To keep data consistent, resources are locked temporarily and released once synchronization completes. A node that fails to synchronize returns an error message.

  • A (Availability): Every operation receives a response, without timing out or returning an error.

Characteristic of availability in a distributed system: every request gets a response.

  • P (Partition tolerance): The nodes of a distributed system are usually deployed across different networks, so network problems will inevitably break communication between some nodes and split the system into partitions. If the system can still serve external requests when that happens, it is partition tolerant.

A distributed system can satisfy at most two of Consistency, Availability, and Partition tolerance at the same time. Most large-scale Internet applications have many nodes in scattered deployments, so partitions cannot be ruled out. The usual choice is therefore to keep P and A and give up strong consistency (C) in favor of eventual consistency.
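The trade-off can be made concrete with a toy model of two replicas. This is a deliberately simplified sketch, not a real replication protocol: the `Cluster` class, its `mode` flag, and the version counter are all assumptions made for illustration.

```python
class Replica:
    def __init__(self):
        self.value = None
        self.version = 0

class Cluster:
    """Two replicas that either refuse writes (CP) or diverge (AP) during a partition."""

    def __init__(self, mode):
        self.mode = mode  # "CP" or "AP"
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: give up availability.
                raise RuntimeError("unavailable: cannot replicate during a partition")
            # Availability first: accept the write locally, replicas now differ.
            replica.value, replica.version = value, replica.version + 1
            return
        version = max(replica.version, other.version) + 1
        for r in (replica, other):
            r.value, r.version = value, version

    def heal(self):
        """End the partition and copy the newest version to both replicas."""
        self.partitioned = False
        newest = max((self.a, self.b), key=lambda r: r.version)
        for r in (self.a, self.b):
            r.value, r.version = newest.value, newest.version
```

In AP mode a write during a partition succeeds, a read from the other replica is stale for a while, and the replicas agree again after `heal()`. That temporary disagreement followed by convergence is what eventual consistency means here. In CP mode the same write is rejected instead.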

Two-phase commit

Two-phase commit (2PC) divides the transaction process into two phases: the prepare phase and the commit phase. In the abbreviation, 2 stands for the two phases, P for the prepare phase, and C for the commit phase.

Another everyday example: two people on a project team report their results to the boss. In the prepare phase, the boss asks both of them to report their results. In the commit phase, if the boss is satisfied with both reports, the bonus is paid. That is one transaction. If either person fails to report, nobody gets a bonus. The whole process involves a transaction manager and participants: the boss is the transaction manager, and the two team members are the participants. The transaction manager decides whether the entire distributed transaction commits or rolls back, and each participant commits or rolls back its own local transaction.
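The roles described above can be sketched in a few lines. This is a minimal in-memory model, assuming no network failures or crashes between the phases; the `Participant` class and the `two_phase_commit` function are hypothetical names.

```python
class Participant:
    """A transaction participant that manages its own local transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: do the work, hold the locks, and vote yes or no.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"

def two_phase_commit(participants):
    """Play the transaction manager: collect votes, then commit or roll back everyone."""
    # Phase 1 (prepare): ask every participant, without stopping at the first "no".
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit): commit only if all votes are "yes".
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled_back"
```

A single "no" vote in the prepare phase causes every participant to roll back, including those that were ready to commit.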

Three-phase commit

Three-phase commit builds on two-phase commit by adding a “CanCommit” query phase before the commit. Before sending the actual transaction request, the transaction coordinator asks each participant whether it can carry out the operation. No real transactional work happens in this phase, and if the query times out, the transaction is terminated.
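The query-first flow can be sketched as follows. The phase names PreCommit and DoCommit come from the usual description of three-phase commit rather than from the text above, a timeout is simulated by a participant replying `None`, and the class and function names are hypothetical.

```python
class Participant:
    def __init__(self, name, reply=True):
        self.name = name
        self.reply = reply  # True or False vote, or None to simulate a timeout
        self.state = "init"

    def can_commit(self):
        # Phase 1: only answers a question, no resources are touched.
        return self.reply

    def pre_commit(self):
        # Phase 2: do the work and hold resources, as in the 2PC prepare phase.
        self.state = "prepared"

    def do_commit(self):
        # Phase 3: make the change permanent.
        self.state = "committed"

def three_phase_commit(participants):
    """Play the coordinator for the failure-free path plus a CanCommit refusal or timeout."""
    replies = [p.can_commit() for p in participants]
    if not all(reply is True for reply in replies):
        # A "no" or a timeout ends the transaction before any work was done.
        return "aborted"
    for p in participants:
        p.pre_commit()
    for p in participants:
        p.do_commit()
    return "committed"
```

Because CanCommit changes nothing on the participants, aborting at that point needs no rollback: every participant is still in its initial state.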

Common solutions for distributed transactions

TCC

TCC stands for Try, Confirm, and Cancel. Each branch transaction must implement three operations: Try (preprocessing), Confirm (confirmation), and Cancel (revocation). Try performs business checks and reserves resources, Confirm carries out the business operation for real, and Cancel rolls back by undoing what Try did. TCC is divided into three phases:

  1. The Try phase performs business checks (consistency) and reserves resources (isolation). It is only a preliminary step; Try and the subsequent Confirm together form the complete business logic.
  2. The Confirm phase commits for real. It starts only after the Try phase of every branch transaction has succeeded. When TCC is used, Confirm is generally assumed not to fail.
  3. The Cancel phase cancels the branch transactions and releases the reserved resources when an error requires a rollback. When TCC is used, Cancel is likewise generally assumed to succeed.
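The three phases above can be sketched with a resource that separates what is available from what is reserved. This is a minimal sketch in which Confirm and Cancel never fail, matching the assumption stated in the list; `ReservableResource` and `run_tcc` are hypothetical names.

```python
class ReservableResource:
    """A branch resource, such as stock or an account balance, with a reserved part."""

    def __init__(self, available):
        self.available = available
        self.reserved = 0

    def try_reserve(self, amount):
        # Try: business check plus resource reservation.
        if amount > self.available:
            raise ValueError("not enough resources to reserve")
        self.available -= amount
        self.reserved += amount

    def confirm(self, amount):
        # Confirm: consume the reservation for real.
        self.reserved -= amount

    def cancel(self, amount):
        # Cancel: undo Try by releasing the reservation.
        self.reserved -= amount
        self.available += amount

def run_tcc(branches):
    """Run Try on every (resource, amount) branch, then Confirm all or Cancel all."""
    tried = []
    try:
        for resource, amount in branches:
            resource.try_reserve(amount)
            tried.append((resource, amount))
    except ValueError:
        # Only branches whose Try succeeded hold a reservation to release.
        for resource, amount in tried:
            resource.cancel(amount)
        return "cancelled"
    for resource, amount in tried:
        resource.confirm(amount)
    return "confirmed"
```

For example, with 10 units of stock and a balance of 50, reserving 2 units and 100 in payment fails in the second Try, so the stock reservation is cancelled and the stock returns to 10.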

Eventual consistency

Eventual consistency (sometimes translated as final consistency) is one of the most widely used approaches to distributed transactions today, because it fits most scenarios in which several services only need to agree in the end. We can rely on the reliable delivery mechanisms of message-oriented middleware such as Kafka and RocketMQ to pass data between services. Take the payment scenario: after a payment is initiated, the third-party payment platform notifies us of the result asynchronously, we update the payment status accordingly, and then points are added. So once the payment callback succeeds, we send an MQ message to the middleware, and the notification service and the points service consume it to complete data synchronization.
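The payment and points flow can be sketched with an in-memory queue standing in for the middleware. This does not use any Kafka or RocketMQ API: the `deque`, the service classes, and the message fields are assumptions made for illustration. The consumer is written to be idempotent because a message may be delivered more than once.

```python
from collections import deque

class PaymentService:
    def __init__(self, queue):
        self.queue = queue
        self.status = {}

    def on_payment_callback(self, order_id, amount):
        """Handle the payment platform's callback: update status, then publish."""
        self.status[order_id] = "PAID"
        self.queue.append({"order_id": order_id, "points": amount})

class PointsService:
    def __init__(self):
        self.points = 0
        self.processed = set()

    def handle(self, message):
        """Add points once per order, even if the message is delivered again."""
        if message["order_id"] in self.processed:
            return
        self.processed.add(message["order_id"])
        self.points += message["points"]

def drain(queue, consumer):
    """Deliver every pending message to the consumer."""
    while queue:
        consumer.handle(queue.popleft())
```

Between the callback and the delivery, the payment is marked as paid while the points are still missing. The two services are inconsistent for a moment and agree once the message has been consumed.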

Best-effort notification

Best-effort notification is similar to the final (eventual) consistency scheme above. It suits business scenarios with low requirements on data consistency, and it works by calling back the receiver repeatedly until the notification is acknowledged. Here’s a scenario:

  1. After the order is placed, the user pays through Alipay.
  2. Alipay records the payment status for the user.
  3. Alipay calls back the merchant. The merchant updates the order status on receiving the notification and returns a status telling Alipay that the notification was received.
  4. The steps above are the ideal case. If something goes wrong along the way, for example Alipay’s notification to the merchant fails or the result is lost, best-effort notification comes into play.
  5. Alipay calls the merchant’s callback interface again at intervals of 1, 5, 10, and 30 minutes. If the merchant still has not returned success, the case is handled by a scheduled task or manually.
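The retry schedule in the last step can be sketched as a small loop. This is not Alipay's implementation: the function name, the `send` callable, and the injected `wait` function are hypothetical, and the sketch assumes one immediate attempt followed by retries after 1, 5, 10, and 30 minutes.

```python
RETRY_DELAYS_MINUTES = (1, 5, 10, 30)

def notify_best_effort(send, delays=RETRY_DELAYS_MINUTES, wait=lambda minutes: None):
    """Call send() until it returns True or the retry schedule is exhausted.

    Returns the number of the successful attempt, or None if every attempt
    failed and the case must go to a scheduled task or manual handling.
    The wait function is injected so the example does not actually sleep.
    """
    if send():
        return 1
    for attempt, delay in enumerate(delays, start=2):
        wait(delay)
        if send():
            return attempt
    return None
```

In a real system `wait` would schedule the next attempt rather than block, and `send` would be the HTTP call to the merchant's callback interface that checks for the success status.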

Alibaba Seata

Seata is an open source distributed transaction solution from Alibaba that is minimally intrusive to business code. At its core is the annotation “@GlobalTransactional”. Seata’s model consists of a globally unique transaction ID plus three basic components:

  • Globally unique transaction ID (XID): operations on multiple databases that share the same ID make up one transaction.
  • Transaction coordinator (TC): maintains the state of global and branch transactions and drives global commit or rollback.
  • Transaction manager (TM): defines the scope of the global transaction. It begins the global transaction and commits or rolls it back.
  • Resource manager (RM): manages the resources that branch transactions work on, talks to the TC to register branch transactions and report their status, and drives the commit or rollback of branch transactions.

  1. The TM asks the TC to begin a global transaction, and the TC generates a globally unique XID.
  2. Each RM registers its branch transaction with the TC, which includes it in the global transaction identified by the XID.
  3. Each RM reports the status of its branch to the TC.
  4. The TC aggregates the execution status of all transaction participants to determine whether the distributed transaction should be committed or rolled back.
  5. The TC notifies each RM to commit or roll back its branch transaction.
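The five steps above can be traced in a toy in-memory model. None of this is Seata's API: the classes, method names, and the use of `uuid` for the XID are hypothetical stand-ins that only mirror the message flow between TM, TC, and RM.

```python
import uuid

class TransactionCoordinator:
    """TC: tracks the branches of each global transaction and drives the outcome."""

    def __init__(self):
        self.branches = {}

    def begin(self):
        xid = uuid.uuid4().hex  # step 1: globally unique XID
        self.branches[xid] = []
        return xid

    def register_branch(self, xid, rm):
        self.branches[xid].append({"rm": rm, "ok": None})  # step 2
        return len(self.branches[xid]) - 1  # branch id

    def report(self, xid, branch_id, ok):
        self.branches[xid][branch_id]["ok"] = ok  # step 3

    def decide(self, xid):
        branches = self.branches[xid]
        commit = all(b["ok"] is True for b in branches)  # step 4
        for b in branches:
            b["rm"].finish(xid, commit)  # step 5
        return "committed" if commit else "rolled_back"

class ResourceManager:
    """RM: runs one branch transaction and obeys the TC's final decision."""

    def __init__(self, name, tc, succeeds=True):
        self.name, self.tc, self.succeeds = name, tc, succeeds
        self.outcome = {}

    def execute_branch(self, xid):
        branch_id = self.tc.register_branch(xid, self)
        self.tc.report(xid, branch_id, self.succeeds)

    def finish(self, xid, commit):
        self.outcome[xid] = "committed" if commit else "rolled_back"

def global_transaction(tc, rms):
    """Play the TM role: begin the global transaction and run its branches."""
    xid = tc.begin()
    for rm in rms:
        rm.execute_branch(xid)
    return xid, tc.decide(xid)
```

If any RM reports a failed branch, `decide` tells every RM registered under that XID to roll back, which is the same all-or-nothing rule as before, applied across services.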