Background: Princess Zhao Min leads a band of masters to besiege Wudang. Zhang Sanfeng, head of the Wudang sect, has been wounded in a sneak attack, so he teaches Zhang Wuji a set of martial arts to deal with Zhao Min’s men. This set of martial arts is called Taijiquan.

Zhang Sanfeng: Wuji, how many moves can you remember?

Zhang Wuji: I have forgotten all of them!

Zhang Sanfeng: Very good. All you need to remember is to knock the two Xuanming Elders flat.

In the last article it was quite fun to explain the Byzantine Generals Problem in distributed systems through the card game Sanguosha (Three Kingdoms Kill). This time, we will cover the remaining three theories through the Taijiquan in The Heaven Sword and the Dragon Sabre:

  • CAP theory
  • ACID theory
  • BASE theory

The essence of Taijiquan: softness overcomes hardness, hardness and softness work together, four taels deflect a thousand catties, and no move beats every move.

I think of CAP theory as Tai Chi itself, ACID theory as its yang, the rigid side (gang), and BASE theory as its yin, the soft side (rou). ACID pursues consistency, while BASE, originally known as flexible transactions, pursues availability. Why could Zhang Wuji still beat the two Xuanming Elders after forgetting every move? Because the essence of Taijiquan lies in the intent behind the fist, not in the individual moves.

One, the two sides of Tai Chi

CAP theory is a high-level abstraction of the characteristics of distributed systems, distilled into three properties:

  • Consistency
  • Availability
  • Partition Tolerance

Consistency in a distributed system means that every read by a client, no matter which node it reaches, either returns the same most recently written data or fails. That is pretty rigid. It is not a bad thing, though: in many scenarios you really do need a high degree of consistency.

To help you understand consistency, here is a story from The Heaven Sword and the Dragon Sabre: six sects lay siege to Bright Top. Abbess Miejue of the Emei sect is in command, leading the six sects in the siege. The initial strategy is to attack from the north. Abbess Miejue finds that the attack from the north is going badly, so she sends carrier pigeons ordering the Wudang and Shaolin sects to attack from the south instead. But the pigeon sent to the Shaolin sect is intercepted by Wei Yixiao, the Green-Winged Bat King of the Ming Cult. The end result: the Shaolin sect still attacks from the north while the Wudang sect attacks from the south. Wouldn’t that be a mess? As shown below:

1.1 Understanding CAP in distributed systems

How does CAP fit into a distributed system? Let me give you an example to help you understand.

  • Initial environment: The client queries or updates nodes 1 and 2, and the value A of the two nodes is 1.

  • The client updates the value of A on node 1 and sets A to 5.

  • Node 1 updates the value of A to 5 and returns a success message to the client.

  • The client accesses node 2 and requests to obtain the value of A. The result is that A = 1. This is inconsistent with the value of A stored in node 1.

  • So how do you make sure that A equals 5 on both nodes? After the client updates node 1, node 2 must also be updated before the client is told that the update succeeded.

  • After both nodes are updated successfully, the client accesses either node and obtains A = 5. This is called consistency.

Consistency emphasizes that data is correct: every read from any node returns the most recently written data. This is what I call gang (rigid).
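The walk-through above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `Node`, `write_local_only` and `write_replicated` are hypothetical names made up for illustration, the "nodes" are in-memory dictionaries, and no real network or failure handling is involved.

```python
class Node:
    """A toy storage node holding key/value pairs in memory (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {"A": 1}  # both nodes start with A = 1

    def read(self, key):
        return self.data[key]


def write_local_only(node, key, value):
    """Update a single node and acknowledge at once (no replication)."""
    node.data[key] = value
    return "ok"


def write_replicated(nodes, key, value):
    """Acknowledge only after every node has applied the update."""
    for node in nodes:
        node.data[key] = value
    return "ok"


node1, node2 = Node("node1"), Node("node2")

# The client updates node 1 only, then reads from node 2.
write_local_only(node1, "A", 5)
stale = node2.read("A")   # 1: node 2 still holds the old value

# The client is told "ok" only after both nodes are updated.
write_replicated([node1, node2], "A", 5)
fresh = node2.read("A")   # 5: either node now returns the latest value

print(stale, fresh)       # prints: 1 5
```

The second write is slower because it waits for both nodes, which is the price of the rigid side.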

However, in a production cluster, partition failures happen (a node is disconnected, cannot respond, or cannot write data), and when a client queries such a node we often cannot afford to simply return an error. Some critical systems in a business cluster, such as a service registry, must not stop responding just because a disconnected node cannot serve the latest data. Otherwise the dependent services cannot obtain registration information at all, and the whole system breaks down.

This is where availability comes in: each node answers client requests from its local data, at the expense of data accuracy. In addition, when a node is unavailable, a fail-fast strategy can be used so that the service is at least not unavailable for a long time. This is what I call rou (soft).

As shown in the following figure, the values returned by nodes 1 and 2 to the client are A = 5 and A = 1, respectively. In other words, the data consistency of nodes 1 and 2 is not guaranteed, but the availability of nodes is considered.

Partition tolerance means that the system continues to work when any number of messages between nodes are lost or delayed. The distributed system promises the client that no matter what data synchronization problems arise internally, it will keep running. The emphasis is on the cluster’s tolerance of partition failures.
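The choice a node faces during a partition can be sketched as follows. This is a minimal illustration, not how any particular database implements it: the `PartitionedNode` class and its `mode` labels ("CP" and "AP") are assumptions invented for this example.

```python
class PartitionedNode:
    """A toy node that may be cut off from its peers (hypothetical)."""

    def __init__(self, value, mode):
        self.value = value         # local copy of A
        self.mode = mode           # "CP" or "AP", labels used only in this sketch
        self.partitioned = False   # True when peers are unreachable

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than risk returning stale data.
            raise RuntimeError("cannot confirm the latest value during a partition")
        # Availability first (or no partition): answer from local data.
        return self.value


cp_node = PartitionedNode(value=1, mode="CP")
ap_node = PartitionedNode(value=1, mode="AP")
cp_node.partitioned = True
ap_node.partitioned = True

ap_result = ap_node.read()   # possibly stale, but the client gets an answer

try:
    cp_node.read()
    cp_result = "value"
except RuntimeError:
    cp_result = "error"      # the client gets an error instead of stale data
```

Both nodes tolerate the partition in the sense that they keep running. They differ only in what they are willing to tell the client.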

1.2 CAP triangle

So how do these three indicators relate? This is the CAP theory that we often hear about. C stands for Consistency, A for Availability, and P for Partition Tolerance.

In a distributed system, at most two of the three CAP properties can be guaranteed at the same time.

  • CA: Ensures consistency and availability. When the distributed system is running properly (as it is most of the time), there is no partition to tolerate, and both C and A can be guaranteed. P only comes into play when a partition failure occurs, at which point the choice is between C and A. Typical application: MySQL in standalone deployment.

  • CP: Ensures data consistency and partition tolerance, for example by making sure every node stores the latest, correct data. The cost is availability: a strongly consistent system built on Raft, for example, may be unable to serve reads and writes during a partition. Typical applications: Etcd, Consul, HBase.

  • AP: Ensures availability and partition tolerance. When users access the system they always get a response rather than an error, but they may read stale data. Typical applications: Cassandra and DynamoDB.

Two, the rigidity of Tai Chi

2.1 Rigid ACID

I first came across ACID when studying SQL databases: Atomicity, Consistency, Isolation, Durability.

These four properties describe transactions. A transaction is a series of operations performed as a single unit of work, such as querying data, modifying data, or modifying data definitions.

Transactions are not only for databases. They also appear in business systems: for example, issuing a voucher and then debiting inventory can be defined as one transaction. On a single node, ACID properties can be guaranteed with mechanisms such as locking and time ordering, but those mechanisms alone cannot guarantee ACID properties for operations that span nodes.
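As a single-node illustration, the voucher-and-inventory example can be sketched with Python’s built-in `sqlite3` module. The table names, the `issue_voucher` helper and the `CHECK` constraint are assumptions made up for this sketch. The point is only that the two statements succeed or fail together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (item TEXT PRIMARY KEY, stock INTEGER CHECK (stock >= 0))"
)
conn.execute("CREATE TABLE vouchers (id INTEGER PRIMARY KEY, item TEXT)")
conn.execute("INSERT INTO inventory VALUES ('sword', 1)")
conn.commit()


def issue_voucher(conn, item):
    """Issue a voucher and debit stock as one unit of work (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO vouchers (item) VALUES (?)", (item,))
            conn.execute(
                "UPDATE inventory SET stock = stock - 1 WHERE item = ?", (item,)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # stock would go negative, so nothing was committed


first = issue_voucher(conn, "sword")    # True: stock goes from 1 to 0
second = issue_voucher(conn, "sword")   # False: the voucher insert is rolled back too

stock = conn.execute("SELECT stock FROM inventory WHERE item = 'sword'").fetchone()[0]
voucher_count = conn.execute("SELECT COUNT(*) FROM vouchers").fetchone()[0]
print(first, second, stock, voucher_count)   # prints: True False 0 1
```

Once the vouchers and the inventory live on different nodes, a single local transaction like this is no longer enough.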

So how do you solve the transaction problem in distributed systems? This is also a common interview question. You have probably heard of distributed transaction protocols such as two-phase commit and TCC. I will again use the story of the six sects besieging Bright Top to explain the two-phase commit protocol.

2.2 The siege of Bright Top

The Emei sect wants to join with the Shaolin, Wudang and Kunlun sects to attack Bright Top (Guangming Ding) tomorrow. If any one of them does not agree to attack, or if the attack times do not match, the whole plan must be cancelled. The attack on Bright Top by the Shaolin, Wudang and Kunlun sects can be regarded as a distributed transaction: either all of them carry it out or none of them do. As shown below:

So how can we help Abbess Miejue solve this coordination problem? We can illustrate it with the two-phase commit protocol.

2.3 Two-phase commit protocol

In the two-phase commit protocol, Abbess Miejue first sends the attack order to the Shaolin sect. The Shaolin sect acts as the coordinator and contacts the Wudang and Kunlun sects to settle whether to attack or to call it off.

Two-phase means there are two phases: 1. the commit-request phase (voting phase), 2. the commit-execution phase (completion phase).

Phase 1: commit-request phase:

  • Step 1: As the coordinator, the Shaolin sect sends a message to the Wudang and Kunlun sects: “Attack Bright Top tomorrow, feasible?”
  • Step 2: The Shaolin, Wudang and Kunlun sects each evaluate whether they can attack Bright Top tomorrow. If they can, they reserve and lock the time and schedule no other attacks.
  • Step 3: The Shaolin sect receives all the responses, including its own evaluation. All three answer: feasible.

As shown below:

Phase 2: commit-execution phase:

  • Step 1: The Shaolin sect tallies its own answer and those of the Kunlun and Wudang sects. All say they can attack, so the distributed transaction can go ahead: attack Bright Top.
  • Step 2: The Shaolin sect notifies the Kunlun and Wudang sects to attack Bright Top.
  • Step 3: The Shaolin, Kunlun and Wudang sects summon their disciples and attack Bright Top (execute the transaction).
  • Step 4: The Kunlun and Wudang sects report to the Shaolin sect whether their attack has been launched.
  • Step 5: The Shaolin sect summarizes the attack results of itself, the Kunlun sect and the Wudang sect and reports them to Abbess Miejue. What Abbess Miejue sees is therefore a single, unified battle plan.

Note:

  • Think of Abbess Miejue as the client, and the Shaolin, Wudang and Kunlun sects as three nodes of the distributed system, with the Shaolin sect acting as the coordinator.
  • Think of evaluating whether the attack is possible and reserving the time as checking the objects to be operated on and their state: whether they are ready and whether a new operation can be committed.
  • Messages and carrier pigeons can be understood as network messages.
  • In the first phase, each participant votes on whether to abort or commit the transaction. Once a participant has voted to commit, it is no longer allowed to abort on its own.
  • In the second phase, every participant carries out the final, unified decision to commit or abort the transaction. This is the atomicity in ACID.
  • In the first phase, resources must be reserved, and no one else can operate on them while they are reserved.
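The two phases can be sketched in Python as follows. This is a simplified, in-memory illustration under stated assumptions: the `Participant` class and the `two_phase_commit` function are hypothetical names, the coordinator is modeled as a plain function rather than as one of the sects, and message loss, timeouts and coordinator failure are all ignored.

```python
class Participant:
    """A sect taking part in the attack, i.e. a node in the transaction."""

    def __init__(self, name, can_attack=True):
        self.name = name
        self.can_attack = can_attack
        self.locked = False    # True while the time slot is reserved
        self.state = "idle"

    def prepare(self):
        """Phase 1: vote, and reserve the time slot when voting yes."""
        if self.can_attack:
            self.locked = True
        return self.can_attack

    def commit(self):
        """Phase 2, commit decision: attack and release the reservation."""
        self.state = "attacked"
        self.locked = False

    def abort(self):
        """Phase 2, abort decision: cancel and release the reservation."""
        self.state = "cancelled"
        self.locked = False


def two_phase_commit(participants):
    """Return True if everyone committed, False if everyone aborted."""
    votes = [p.prepare() for p in participants]   # phase 1: collect all votes
    decision = all(votes)                         # a single "no" cancels the plan
    for p in participants:                        # phase 2: one decision for all
        if decision:
            p.commit()
        else:
            p.abort()
    return decision


sects = [Participant("Shaolin"), Participant("Wudang"), Participant("Kunlun")]
all_agree = two_phase_commit(sects)        # True: every sect attacks

wary = [
    Participant("Shaolin"),
    Participant("Wudang"),
    Participant("Kunlun", can_attack=False),
]
one_refuses = two_phase_commit(wary)       # False: nobody attacks
```

Either every participant ends in the state "attacked" or every participant ends in "cancelled", which is the all-or-nothing behavior described in the notes.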

2.4 Problems with the two-phase commit protocol

ACID sits at the extreme end of consistency in CAP and can be called the strongest consistency. If that level of consistency is implemented in a distributed system, availability is bound to suffer: if one node fails, the whole distributed transaction fails.

In most scenarios the consistency requirement is not that high, and strong consistency is unnecessary. Transient inconsistency is acceptable as long as the data is guaranteed to be correct in the end. In other words, an eventual consistency scheme can be used to keep the data consistent.

Another protocol worth mentioning is three-phase commit (not to be confused with TCC, Try-Confirm-Cancel, which is a different protocol). It targets two pain points of two-phase commit: coordinator failure and participants locking resources for a long time. It introduces an inquiry phase and a timeout mechanism to reduce how long resources stay locked. But the extra rounds of messages increase system load and response latency, so the three-phase commit protocol is rarely used.

Three, the softness of Tai Chi

3.1 Soft BASE

Having covered the rigid side of Tai Chi, let us turn to its soft side. When it comes to flexibility in distributed transactions, BASE theory, commonly known as flexible transactions, must be mentioned. BASE theory is an extension of AP in CAP theory. Most Internet-facing distributed systems emphasize availability and are designed with BASE in mind. This theory is very important: once you understand it, designing a distributed architecture that fits your business becomes much easier and far less confusing.

The core of BASE: Basically Available (BA), Soft state (S), and Eventually consistent (E).

So why is it called a flexible transaction? It is the opposite of ACID: it does not have to be consistent at every moment. Think of a rubber band: bend it, let go, and it springs back on its own. That is its flexible side.

3.2 What is the relationship between BASE and Taijiquan

No move in Taijiquan goes straight out as a punch. Every move emphasizes smoothness and drawing an arc. It looks soft, but there is strength inside the softness. Each move ends with a very rigid shake (I cannot describe it in words, so go watch the TV series). That final shake can be thought of as the rigid side, which corresponds to eventual consistency.

3.3 Basic Availability

How should basic availability be understood? The key word is “basic”, and the theory does not tell us how to define it; it is a vague concept. In effect, it is a question of how soft to be.

In distributed systems, basic availability can be understood as keeping core functions available while allowing some functions to become unavailable. It can be achieved in four ways.

  • Traffic peak shaving: for example, splitting a flash sale into several sessions, such as an 8 o’clock session and a 12 o’clock session.
  • Delayed response: for example, when an order is placed in an online mall during Double 11, the customer is told that the order is being created, which may take more than 10 seconds.
  • Experience degradation: for example, when a large number of users open a game’s event page to view images and many images fail to load because of network timeouts, we can replace the original images with lower-resolution or smaller ones.
  • Overload protection: for example, when a message queue we rely on is full, we can discard subsequent requests or clear some requests from the queue to protect the system from overload. This has to be designed around your own business scenario.
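As one illustration of the last item, overload protection by discarding requests once a queue is full might look like the sketch below. The `BoundedQueue` class and its `offer` method are hypothetical names chosen for this example. A real message queue would offer its own configuration for this.

```python
from collections import deque


class BoundedQueue:
    """A request queue that sheds new work once it is full (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.rejected = 0   # how many requests were discarded

    def offer(self, request):
        """Enqueue the request, or discard it if the queue is at capacity."""
        if len(self.items) >= self.capacity:
            self.rejected += 1
            return False
        self.items.append(request)
        return True


queue = BoundedQueue(capacity=3)
accepted = [queue.offer(f"order-{i}") for i in range(5)]
print(accepted)         # prints: [True, True, True, False, False]
print(queue.rejected)   # prints: 2
```

Rejecting the fourth and fifth requests keeps the first three safe. Whether to drop new requests or old ones depends on the business scenario.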

3.4 Eventual consistency

Eventual consistency: all replicas of the data in the system reach a consistent state after a period of synchronization. “Eventually” can be understood as “after a short delay”.

Eventual consistency is adopted by a wide range of Internet services. Systems that deal with money, such as financial systems, use strong consistency or transactions instead.

What is the relationship between ACID’s strong consistency and eventual consistency?

Strong consistency is actually a special case of eventual consistency: it can be seen as eventual consistency with no delay. If the delay cannot be tolerated, use strong consistency; otherwise, use eventual consistency.
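The delay that separates the two can be sketched with a toy primary and follower. The `Primary` and `Follower` classes and the `sync` function are hypothetical names for illustration only: the primary acknowledges writes immediately and ships them to the follower later.

```python
class Primary:
    """Accepts writes at once and records them for later replication."""

    def __init__(self):
        self.data = {}
        self.pending = []   # updates not yet shipped to the follower

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))


class Follower:
    """Holds a copy of the data that may lag behind the primary."""

    def __init__(self):
        self.data = {}


def sync(primary, follower):
    """Ship pending updates in order; afterwards the two copies match."""
    for key, value in primary.pending:
        follower.data[key] = value
    primary.pending.clear()


primary, follower = Primary(), Follower()
primary.write("A", 5)

before_sync = follower.data.get("A")   # None: the follower has not seen the write
sync(primary, follower)
after_sync = follower.data.get("A")    # 5: the follower has caught up
```

If `sync` ran inside `write` before the write was acknowledged, the delay would drop to zero and the sketch would behave like the strongly consistent case.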

3.5 What is the relationship between eventual consistency and Tai Chi

One of the most remarkable things about Tai Chi is unloading force. When the opponent attacks with all his strength, a Tai Chi move unloads that strength so the attack comes to nothing. Unloading force corresponds to the traffic peak shaving discussed earlier. And once the force has been unloaded, it is our turn to attack.

Four, no move beats every move

Back to the beginning of the article: Zhang Sanfeng taught Zhang Wuji Taijiquan, but Zhang Wuji forgot all of it. How could he still defeat the two Xuanming Elders?

Because Taijiquan is about the intent behind the fist, not the moves. Zhang Wuji had grasped that intent, so having no move beat every move.

When we design distributed systems, we should not just memorize the three theories but truly understand the principles behind them. Then we can iterate, step by step, toward the distributed architecture that best suits the current business system.

Five, the summary

  • Taijiquan has a yin side and a yang side, just like the C and A in CAP.
  • CAP theory is the foundational theory of distributed systems. It has three key properties: consistency, availability and partition tolerance.
  • ACID is the traditional database design philosophy and pursues strong consistency. It has four properties: atomicity, consistency, isolation and durability. It is an extension of CP in CAP.
  • BASE theory is the result of a trade-off between consistency and availability in CAP, and is an extension of AP in CAP. It puts availability and performance first, achieves elastic basic availability based on the characteristics of the business scenario, and then achieves eventual consistency of the data.
  • BASE theory largely solves the pain points of transactional systems in performance, fault tolerance, availability and other areas.
  • BASE theory is widely used in NoSQL and is the theoretical support of NoSQL system design.

This article also explained the core principle of two-phase commit through the story of the six sects besieging Bright Top, and I trust it made sense. It took two weeks to conceive and is finally out, so I shamelessly ask for a read and a forward ~