Directory:
- Basic Concepts (This chapter)
- Distributed transaction theory
- Distributed transaction solution 2PC
- TCC for distributed transaction solutions
- Reliable message final consistency for distributed transaction solutions
- Best efforts notification for distributed transaction solutions
- Distributed transaction synthesis case study
Basic concept
Juejin. Cn/post / 684490…
Basic theory of distributed transactions
From the previous study, we learned the basic concepts of distributed transactions. Unlike local transactions, distributed systems are called distributed because the nodes that provide services are distributed on different machines and interact with each other through the network. The whole system cannot provide services because of a few network problems. Network factors become one of the criteria for the consideration of distributed transactions. Therefore, distributed transactions need further theoretical support. Next, we will first study the CAP theory of distributed transactions. Before explaining distributed transaction control solutions, we need to learn some basic theories, through which to guide us to determine the target of distributed transaction control, so as to help us understand each solution.
2.1 theory of the CAP
2.1.1. Understand the CAP
CAP is the abbreviation of Consistency, Availability, and Partition tolerance, respectively. In order to facilitate the understanding of CAP theory, we combine some business scenarios in e-commerce system to understand CAP. The following figure shows the execution process of commodity information management:
The overall execution process is as follows:
- The goods service requests the master database to write the goods information (add goods, modify goods, delete goods)
- The primary database successfully wrote to the commodity service response.
- The commodity service request reads the commodity information from the database.
C – Consistency:
Consistency means that the read operation after the write operation can read the latest data status. If the data is distributed on multiple nodes, the data read from any node is the latest data status.
In the figure above, the consistency of reading and writing of commodity information is to achieve the following objectives:
- If writing to the master database succeeds, querying new data from the slave database succeeds.
- If writing goods and services to the master database fails, querying new data from the slave database also fails.
How to achieve consistency?
- After writing to the master database, data is synchronized to the slave database.
- After data is written to the master database, the slave database must be locked during data synchronization to the slave database. Release the lock after the synchronization is complete. Otherwise, old data can be queried from the slave database after new data is written to the master database.
Characteristics of distributed system consistency:
- Due to the process of data synchronization, the response of write operations is delayed.
- To ensure data consistency, resources are temporarily locked and released after data synchronization is complete.
- If a node fails to synchronize data, it will return an error message and must not return old data.
A – the Availability:
Availability means that any transaction operation can get response results without response timeouts or response errors.
In the figure above, reading commodity information to meet availability is to achieve the following objectives:
- The data query request received from the database can immediately respond to the data query result.
- Response timeouts or errors are not allowed from the slave database.
How is availability achieved?
- After writing to the master database, data is synchronized to the slave database.
- To ensure the availability of the slave database, resources in the slave database cannot be locked.
- Even if the data has not been synchronized, the data to be queried must be returned from the database, even if it is old data. If there is no old data, the default information can be returned according to the convention, but cannot return an error or response timeout.
Characteristics of distributed system availability:
- All requests are responded to without response timeouts or errors.
P-partition tolerance:
Usually, each node of a distributed system is deployed in different subnets, which is network partition. It is inevitable that communication between nodes will fail due to network problems, but they can still provide services to the outside, which is called partition tolerance.
In the figure above, reading and writing of commodity information to meet partition tolerance is to achieve the following goals:
- The failure to synchronize data from the primary database to the secondary database does not affect read and write operations.
- The failure of one node does not affect the external services provided by the other node.
How to achieve partition tolerance?
- Use asynchrony instead of synchronous operations, such as asynchronously synchronizing data from the primary database to the slave database, so that loose coupling between nodes can be achieved.
- Add slave database nodes where one slave node suspends the other slave nodes to provide service.
Characteristics of distributed partition tolerance:
- Partition tolerance is a basic capability of a distributed system.
2.1.2.CAP combination mode
1. Does the above commodity management example also have CAP?
All distributed transaction scenarios do not have the three features of CAP at the same time, because C and A cannot coexist under the premise of P.
Such as:
In the following figure, meeting P means achieving partition tolerance:
The meanings of partition tolerance in this figure are:
- The primary database synchronizes data to the secondary data over the network. The primary and secondary databases are deployed in different partitions and interact with each other over the network.
- The network failure between the primary and secondary databases does not affect the external services provided by the primary and secondary databases.
- The failure of one node does not affect the external services provided by the other node.
To implement C, ensure data consistency. During data synchronization, to prevent inconsistent data from being queried from the secondary database, lock the secondary database data and unlock it after the synchronization is complete. If the synchronization fails, return error information or timeout information to the secondary database.
If A is to be implemented, data availability must be guaranteed, so that data can be queried from slave data at any time and no response times out or error messages are returned.
Through analysis, it is found that C and A are contradictory on the premise that P is satisfied.
2. What are the combinations of CAP?
Therefore, when dealing with distributed transactions in production, it is necessary to determine which two aspects of CAP are satisfied according to the requirements.
1) AP: Abandon consistency in favor of partition tolerance and availability. This is the design choice of many distributed systems.
For example: the above commodity management, can achieve AP, the premise is as long as the user can accept the query to the data within a certain period of time is not the latest.
Generally, AP will guarantee the final consistency. The BASE theory described later is extended based on AP. Some business scenarios include order refund, successful refund today, account arrival tomorrow, as long as the user can accept the account arrival within a certain period of time.
2) CP: Abandon availability and pursue consistency and fault tolerance of partitions. Our ZooKeeper actually pursues strong consistency. Another example is inter-bank transfer, a transfer request is not completed until the whole transaction is completed by both banks.
3) CA: give up partition tolerance, that is, do not partition, do not consider the problem due to network connectivity or node hang up, then consistency and availability can be achieved. The system would not be a standard distributed system, and the relational data we use most often would satisfy the CA.
The above commodity management, if you want to implement CA architecture is as follows:
There is no data synchronization between the master database and the slave database. The database can respond to each query request, and each query request can return the latest data through the transaction isolation level.
2.1.3 summary
Through the above, we have learned relevant knowledge of CAP theory, CAP is a proven theory: A distributed system can satisfy at most two of Consistency, Availability, and Partition tolerance at the same time. It can be considered as a standard for architecture design and technology selection. For most large-scale Internet application scenarios, there are many nodes, scattered deployment, and the current cluster scale is getting larger and larger, so node failure and network failure are normal, and service availability should be guaranteed to reach N 9 (99.99.. %), and to achieve good response performance to improve user experience, the following choices are generally made: ensure P and A, abandon C strong consistency to ensure final consistency.
2.2. The BASE theory
Understand strong consistency and ultimate consistency
CAP theory tells us that a distributed system can only meet at most two of Consistency, Availability and Partition tolerance at the same time. AP is more common in practical applications, and AP means to abandon Consistency. Ensure availability and partition tolerance, but in actual production, consistency should be realized in many scenarios. For example, in the previous example, the master database synchronizes data to the slave database. Even if there is no consistency, data must be successfully synchronized to ensure data consistency, which is different from consistency in CAP. Consistency in CAP requires that the data of each node must be consistent when queried at any time. It emphasizes strong consistency, but the final consistency allows the data of each node to be inconsistent within a period of time, but after a period of time, the data of each node must be consistent, which emphasizes the consistency of the final data.
2. Introduction to Base theory
BASE is an acronym for Basically Available, Soft state, and Eventually consistent. BASE theory is an extension of AP in CAP, which achieves availability by sacrificing strong consistency. When failure occurs, part of the data is allowed to be unavailable but core functions are ensured to be available. Data is allowed to be inconsistent for a period of time, but eventually reaches a consistent state. The transactions satisfying BASE theory are called “flexible transactions”.
- Basic availability: When a distributed system fails, it allows the loss of some available functions to ensure the availability of core functions. For example, e-commerce website transaction payment problems, goods can still be viewed normally.
- Soft state: Since strong consistency is not required, BASE allows the existence of intermediate states (also called soft states) in the system. This state does not affect system availability, such as “payment in progress”, “data synchronization in progress”, etc. After data consistency, the state will be changed to “success”.
- Final consistency: Indicates that data on all nodes will be consistent after a period of time. For example, the “payment in progress” status of the order will eventually change to “payment success” or “payment failure”, so that the order status and the actual transaction result are consistent, but need a certain time delay, waiting.
In order not to read boring, in this only wrote the content of the second section, the content behind, will be updated by the chapter, you can pay attention to my continuous reading, feel good can point a thumb-up support!