In-depth understanding of distributed transaction solutions with high concurrency

1. What are distributed transactions

Distributed transaction means that transaction participants, transaction supporting servers, resource servers, and transaction managers are located on different nodes of different distributed systems. To put it simply, a large operation consists of different small operations, which are distributed on different servers and belong to different applications. Distributed transactions need to ensure that all of these small operations either succeed or fail. Essentially, distributed transactions are designed to ensure data consistency across different databases.

2. Causes of distributed transactions

2.1. Database classification and table

When the database single table a year to produce more than 1000W data, then it is necessary to consider the sub-database sub-table, the specific sub-database sub-table principle is not explained here, later free to say in detail, simply speaking is the original a database into a number of databases. In this case, if an operation accesses both 01 and 02 libraries and the data is consistent, then distributed transactions are used.

2.2 soA-BASED application

Soa-ization is the servitization of business. For example, the original single support of the entire e-commerce website, now the whole website is disassembled, separated from the order center, user center, inventory center. For the order center, there is a special database to store the order information, the user center also has a special database to store the user information, and the inventory center also has a special database to store the inventory information. At this time, if you want to operate the order and inventory at the same time, it will involve the order database and inventory database, in order to ensure data consistency, it needs to use distributed transactions.

The above two cases appear different, but the essence is the same, because there are more databases to operate!

3. ACID properties of transactions

3.1 Atomicity (A)

Atomicity means that all operations in the entire transaction either complete or do not complete, with no intermediate states. If a transaction fails during execution, all operations are rolled back, as if the entire transaction had never been executed.

3.2 Consistency (C)

Transaction execution must ensure the consistency of the system. Take transfer as an example. A has 500 yuan and B has 300 yuan.

3.3 Isolation (I)

The so-called isolation means that transactions do not affect each other, and the intermediate state of a transaction is not perceived by other transactions.

3.4 Persistence (D)

Persistence means that once a transaction is completed, the changes made by the transaction to the data are completely preserved in the database, even if there is a power outage or system downtime.

4. Application scenarios of distributed transactions

4.1, payment

The classic scenario is a payment, a payment, which is a deduction to the buyer’s account and an addition to the seller’s account, and these operations must be performed in a transaction, and either they all succeed or they all fail. As the buyer account belongs to the buyer center, it corresponds to the buyer database, while the seller account belongs to the seller center, it corresponds to the seller database. Therefore, the operation of different databases must introduce distributed transactions.

4.2 place orders online

When buyers place orders on e-commerce platforms, two actions are often involved, one is to withhold inventory and the other is to update order status. Inventory and order generally belong to different databases, so distributed transactions are required to ensure data consistency.

5. Common distributed transaction solutions

5.1 Two-phase commit based on XA protocol

XA is a distributed transaction protocol proposed by Tuxedo. XA is roughly divided into two parts: the transaction manager and the local resource manager. Among them, the local resource manager is often implemented by the database, such as Oracle, DB2 and other commercial databases have implemented XA interface, and the transaction manager as the global scheduler, responsible for the submission and rollback of each local resource. XA implements distributed transactions as follows:

In general, the XA protocol is relatively simple, and once commercial databases implement it, the cost of using distributed transactions is low. However, XA also has a fatal disadvantage, that is, the performance is not ideal, especially in the transaction of single link, often high concurrency, XA can not meet the high concurrency scenario. Currently, XA is well supported by commercial databases, but not so well supported by mysql databases. The XA implementation of mysql does not record logs in the prepare phase, causing data inconsistency between the active and standby databases due to the switchover between the active and standby databases. Many NoSQL also don’t support XA, which makes for a very narrow application scenario for XA.

5.2. Message transaction + Final consistency

So-called news two-phase commit transaction is based on message oriented middleware, is essentially the message middleware is a kind of special use, it is the local affairs, and sending messages in a distributed transaction, ensure success or local operation is successful and foreign successful message, either both failure, open source RocketMQ will support this feature, the principle is as follows:

1. System A sends A preparatory message to the message middleware

2. The message middleware saves the prepared message and returns success

3. A performs local transactions

4. A sends A submission message to the message middleware

The above four steps complete a message transaction. Errors may occur in each of the above four steps. The following is an analysis of each step:

If step 1 fails, the entire transaction fails and local operation of A is not executed. If Step 2 fails, the entire transaction fails and local operation of A is not executed. In this case, the prepared message needs to be rolled back. The answer is that the system A implements A callback interface of the message-ware, and the message-ware will continuously execute the callback interface to check whether the transaction execution of A is successful. If it fails, step 4 of the preparatory message will be rolled back. At this time, the local transaction of A is successful, so should the message-ware roll back A? The answer is no, in fact, through the callback interface, the message middleware can check the successful execution of A, at this time, there is no need for A to send A submission message, the message middleware can submit the message itself, so as to complete the whole message transaction. Two-phase submission based on message middleware is often used in high concurrency scenarios. A distributed transaction into A message transaction (A system of local operating + send messages) + B local operation of the system, and the B system driven by news operation, as long as the message transaction is successful, then A success operation, message must be sent to, then B will receive A message to perform local operation, if the local operation fails, A message will be back, Until B succeeds, the distributed transaction between A and B is realized in A disguised way. The principle is as follows:

Although the above scheme can complete the operations of A and B, A and B are not strictly consistent, but ultimately consistent. Consistency is sacrificed here in exchange for A significant improvement in performance. Of course, there are risks to this gameplay, and if B is consistently unsuccessful, the consistency will be broken, depending on how much risk the business can take.

5.3 TCC programming mode

The so-called TCC programming pattern is also a variant of two-phase commit. TCC provides a programming framework that divides the entire business logic into three parts: Try, Confirm, and Cancel operations. Taking online orders as an example, inventory will be deducted in the Try stage, and order status will be updated in the Confirm stage. If the order update fails, inventory will be restored in the Cancel stage. In short, TCC is an artificial two-phase commit implemented by code. Different business scenarios write different code with different complexity, so this model does not reuse well.

6, summary

Distributed transaction, in essence, is the unified control of the transactions of multiple databases. According to the control strength, it can be divided into: no control, partial control and complete control. No control means not introducing distributed transactions, partial control means various variants of two-phase commit, including the above mentioned message transaction + final consistency, TCC mode, and full control means fully implementing two-phase commit. The advantage of partial control is that concurrency and performance are good, while the disadvantage is that data consistency is weakened. Full control sacrifices performance to ensure consistency, which method is used ultimately depends on the business scenario. As a technical personnel, we must not forget that technology is for business services, not for the sake of technology and technology, technology selection for different businesses is also a very important ability