The previous article covered database transactions and Spring transaction management in depth, so this article discusses distributed transactions and why we should avoid them.

1. Introduction to distributed transactions

1.1 Introduction to distributed services

First of all, the rise of distributed systems is inseparable from cloud computing, microservices, database and table sharding, and related technologies.

"Distributed" generally refers to distributed services: our service is deployed on several machines (physical or virtual) that work together to serve requests. Distributed services grew out of the limits of a single node. Because the capacity of one node is limited, we increase the capacity of the service by adding nodes.

1.2 Introduction to transactions

A transaction generally refers to a database transaction. Operations within the same transaction are guaranteed the four ACID properties.

  • Atomicity: Either every operation succeeds or every operation fails.
  • Consistency: A, I, and D are properties of how the database processes a transaction. Consistency emphasizes that the application moves from one correct state to another correct state. It can be said that A, I, and D together ensure C.
  • Isolation: Multiple transactions that process the same data at the same time are isolated from each other and do not affect each other.
  • Durability: Once a transaction completes, its results are persisted to disk.
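Atomicity is the property that is easiest to see in code. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in for any relational database. The `account` table, the `transfer` helper, and the amounts are assumptions made up for illustration. The credit is applied first on purpose, so that when the debit violates the `CHECK` constraint there is already a change to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src, inside one local transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected the debit


def balances(conn):
    rows = conn.execute("SELECT name, balance FROM account ORDER BY name")
    return dict(rows)


ok = transfer(conn, "alice", "bob", 30)       # both updates commit together
failed = transfer(conn, "alice", "bob", 500)  # debit fails, credit is undone
print(ok, failed, balances(conn))
```

After the second call the balances are the same as after the first one: the credit to `bob` was rolled back together with the failed debit, which is the "all success or all failure" behavior described above.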

1.3 Distributed transactions

If a single service operates on a single database, Spring solves the transaction problem for us very well, and no distributed transaction is involved.

What are distributed transactions?

A distributed transaction means that when a database transaction alone cannot guarantee data consistency, other means are used to ensure it.

Consistency here should be understood in terms of the CAP theorem and BASE theory. It can be strong consistency or eventual consistency.

Distributed transactions arise for two reasons:

  • Multiple nodes for application services

    From early SOA to today's microservices, splitting applications into fine-grained services has become a trend. Different business capabilities are split into separate services, yet some of them remain interrelated. For example, goods, orders, and points belong to different services, but when a user places an order we need to deduct inventory, add points for the user, and deduct the balance, so the ordering flow involves calls between services. We must keep these operations consistent, which requires distributed transactions.

  • Multiple nodes for database services

    With the growth of the Internet, the amount of data each company holds is soaring, and database and table sharding is increasingly mature. For MySQL, a commonly cited rule of thumb is to shard once a single table reaches about 50 million rows. Transactions work at the level of a single database, so operations on different databases cannot be covered by one transaction and need a distributed transaction instead.
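Both reasons come down to the same problem: two local transactions cannot be committed as one. Here is a minimal sketch, assuming two independent in-memory SQLite databases stand in for the order database and the inventory database. The table names and the `place_order_naively` function are hypothetical.

```python
import sqlite3

order_db = sqlite3.connect(":memory:")  # stands in for the order database
stock_db = sqlite3.connect(":memory:")  # stands in for the inventory database

order_db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)"
)
stock_db.execute(
    "CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))"
)
with stock_db:
    stock_db.execute("INSERT INTO stock VALUES ('book', 1)")


def place_order_naively(item, qty):
    """Two separate local transactions, with nothing tying them together."""
    with order_db:  # local transaction 1 commits when this block exits
        order_db.execute(
            "INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty)
        )
    with stock_db:  # local transaction 2 can still fail after 1 committed
        stock_db.execute(
            "UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item)
        )


try:
    place_order_naively("book", 5)  # only 1 book in stock
except sqlite3.IntegrityError:
    pass  # the stock update was rolled back, the order insert was not

orders = order_db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
stock = stock_db.execute(
    "SELECT qty FROM stock WHERE item = 'book'"
).fetchone()[0]
print(orders, stock)
```

The order row survives even though no stock was deducted. Each database kept its own ACID guarantees, but the business operation as a whole is inconsistent, and this is the gap that distributed transaction solutions try to close.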

2. Distributed transaction solutions

This article does not go into detail on distributed transaction solutions. A later article will cover them in depth.

  • Two-phase commit (2PC)

    • XA

      XA is a distributed transaction protocol standardized by X/Open and closely associated with Tuxedo. XA is roughly divided into two parts: the transaction manager and the local resource manager. The local resource manager is implemented by the database; mainstream databases such as Oracle and MySQL implement the XA interface. The transaction manager acts as the global coordinator and is responsible for committing and rolling back each local resource.

    • AT

      AT is a mode implemented in Seata. It provides global transactions and is non-intrusive to business code.

  • Three-phase commit (3PC)

    3PC is a variant of 2PC that adds a CanCommit phase and introduces a timeout mechanism.

  • Compensating transactions

    • Saga

      Each forward operation records how to roll it back. If a forward operation fails, the distributed transaction runs the compensations of the previous participants in reverse order, returning the system to its initial state.

    • TCC

      TCC is implemented in business logic and is highly intrusive to code. It resembles 2PC but must be implemented by the application itself: every business operation has to provide Try, Confirm, and Cancel operations.

  • Local message table (eventual consistency)

    Add a message table to the business database and write the message in the same local transaction as the business change. A scheduled task then delivers pending messages and updates their state. Like message transactions, this guarantees eventual consistency.

  • Message transactions (eventual consistency)

    Message transactions implement distributed transactions with the help of message middleware and guarantee eventual consistency.
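To make the two-phase idea behind XA concrete, here is a minimal in-memory sketch of a coordinator and its participants. The `Participant` class and the `two_phase_commit` function are hypothetical names invented for this illustration; they are not the XA interface or Seata's API. Timeouts, persistent logs, and crash recovery, which real implementations need, are left out.

```python
class Participant:
    """Hypothetical resource manager that votes, then commits or rolls back."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: do the work, hold the locks, and report a vote.
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        # Phase 2 (success path): make the prepared work permanent.
        self.state = "committed"

    def rollback(self):
        # Phase 2 (failure path): undo whatever was prepared.
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Coordinator: commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect votes
    if all(votes):
        for p in participants:  # phase 2: global commit
            p.commit()
        return True
    for p in participants:  # phase 2: global rollback
        p.rollback()
    return False


orders = Participant("orders")
stock = Participant("stock", can_commit=False)  # e.g. not enough inventory
result = two_phase_commit([orders, stock])
print(result, orders.state, stock.state)
```

One refusing participant forces every participant to roll back, including those that had already prepared successfully. The blocking and the extra round trips between the two phases are part of why this approach is expensive.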

Seata implements several of the approaches above. It has four modes: AT, XA, Saga, and TCC. AT and XA are both based on 2PC.
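The Saga idea listed above, where each forward step is paired with a compensation, can be sketched in a few lines. This is only an illustration and not Seata's Saga mode: the `run_saga` function, the in-memory `state` dictionary, and the ordering steps (borrowed from the ordering example earlier in the article) are all assumptions.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action fails, run the compensations of the steps that already
    completed, in reverse order, and report failure.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


# Hypothetical state spread across three services.
state = {"stock": 10, "points": 0, "balance": 20}


def adjust(key, delta):
    state[key] += delta


def charge(amount):
    if state["balance"] < amount:
        raise ValueError("insufficient balance")
    state["balance"] -= amount


steps = [
    (lambda: adjust("stock", -1), lambda: adjust("stock", +1)),
    (lambda: adjust("points", +50), lambda: adjust("points", -50)),
    (lambda: charge(100), lambda: adjust("balance", +100)),
]

ok = run_saga(steps)
print(ok, state)
```

The third step fails, so the points are removed and the stock is restored, in that order. Unlike 2PC, the intermediate states were visible to other readers while the saga was running, which is the trade-off of compensation-based approaches.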

3. Avoid distributed transactions

In fact, distributed transactions perform poorly. Often, after we split our services, we add a large number of distributed transactions, which looks sophisticated. But only after using distributed transactions do you discover how many pitfalls they have. A database transaction is already expensive, and we have added further complexity on top of it. Response times suffer, errors are hard to troubleshoot, and sometimes a transaction that should have rolled back does not roll back, or rolls back only partially, which is very frustrating.

So it is important to avoid distributed transactions and over-engineering.