This is my 43rd original article

Let me start with a little story. There is a remarkable computer scientist, Leslie Lamport, who won the Turing Award and has spent his career studying distributed systems. Consistency in a distributed environment has always been hard to achieve, and Lamport wrote a paper called “The Part-Time Parliament”. The paper presents the Paxos algorithm, which addresses data inconsistency in a distributed environment where machines in the cluster may go offline at any time. But Lamport did something unusual: he took an algorithm paper and turned it into a story, as you can tell from the title, rather than writing it in the usual formal style. As a result, leading academic journals at first refused to accept the paper.

If you are interested, reply “Paxos” in the account backend to get this 11-page, formula-free paper on the algorithm.

The purpose of a distributed consistency protocol is to let all nodes reach a consensus even when the network misbehaves. As long as the nodes in the cluster follow the protocol, they can agree on a given piece of information despite network anomalies.

The distributed consistency protocols that are widely used at present are:

  • Paxos
  • ZAB
  • Raft

Paxos protocol

The Paxos protocol requires several roles to handle corresponding tasks:

  • Proposer (makes proposals)
  • Acceptor (the decision maker)
  • Learner (learns the decision)

Every node plays all three roles at once. When the data or opinions of different nodes disagree, the three roles reach agreement in two stages:

  • Prepare (proposal) stage: the proposer makes a proposal to the acceptors.
  • Accept (decision) stage: the acceptors accept the proposal and inform all learners.
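To make the roles and the two stages a little more concrete, here is a minimal Python sketch. The names `Role`, `Prepare`, `Accept` and `Node` are my own illustrative choices, not part of any Paxos library, and the messages carry only the fields the story below needs.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PROPOSER = "proposer"
    ACCEPTOR = "acceptor"
    LEARNER = "learner"


@dataclass(frozen=True)
class Prepare:
    """Stage 1 message: a proposer announces a numbered proposal."""
    number: int


@dataclass(frozen=True)
class Accept:
    """Stage 2 message: a proposer sends the number together with the content."""
    number: int
    value: str


@dataclass
class Node:
    """A cluster member; by default it plays all three roles at once."""
    name: str
    roles: frozenset = frozenset(Role)


node = Node("Liu Bei")
print(sorted(role.value for role in node.roles))
```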

Scenario: three generals, Liu Bei, Guan Yu and Zhang Fei, each command an army. Each has received a secret message from their strategist Zhuge Liang ordering an attack. Liu Bei and Guan Yu received the order to attack Cao Cao, while Zhang Fei received the order to attack Zhou Yu. What should they do? Note that communication is unreliable and a messenger may be killed on the way.

Prepare proposal stage

  • Prepare proposal stage:

    Proposal Phase 1: the Proposer sends a numbered proposal to more than half of all Acceptor nodes.

    Sun Gan, who was following Liu Bei, sent secret letter number 666 to the three sworn brothers Liu, Guan and Zhang at the same time. Mi Zhu also sent one, numbered 555, and Fa Zheng sent one numbered 111.

    Proposal Phase 2: each Acceptor responds to the proposal with the highest number it has received.

    In the meantime, two messengers were lost, so Liu Bei received the letters from Sun Gan, Mi Zhu and Fa Zheng, Guan Yu received the letters from Sun Gan and Fa Zheng, and Zhang Fei received the letters from Mi Zhu and Fa Zheng. Each of them then replied “received” to the sender of the highest-numbered letter he held.
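The prepare stage of the story can be replayed in a few lines of Python. This is only a sketch of the simplified rule used here (each general replies once, to the highest number he holds); the names `PROPOSALS`, `DELIVERED`, `prepare_phase` and `has_majority` are made up for illustration.

```python
# Proposal numbers taken from the story.
PROPOSALS = {"Sun Gan": 666, "Mi Zhu": 555, "Fa Zheng": 111}

# Which letters actually arrived (two of the nine messengers were lost).
DELIVERED = {
    "Liu Bei": ["Sun Gan", "Mi Zhu", "Fa Zheng"],
    "Guan Yu": ["Sun Gan", "Fa Zheng"],
    "Zhang Fei": ["Mi Zhu", "Fa Zheng"],
}


def prepare_phase(proposals, delivered):
    """Return {proposer: [acceptors that replied 'received' to him]}."""
    promises = {name: [] for name in proposals}
    for acceptor, senders in delivered.items():
        # Simplified rule: reply only to the highest-numbered letter held.
        winner = max(senders, key=lambda sender: proposals[sender])
        promises[winner].append(acceptor)
    return promises


def has_majority(count, cluster_size):
    """True when count is more than half of the cluster."""
    return count > cluster_size // 2


promises = prepare_phase(PROPOSALS, DELIVERED)
for proposer, repliers in promises.items():
    print(proposer, repliers, has_majority(len(repliers), len(DELIVERED)))
```

Only Sun Gan ends up with replies from more than half of the three generals, which is why he is the one who moves on to the accept stage.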

Accept decision stage

  • Accept decision stage:

    Decision Phase 1: a Proposer that receives responses to its proposal from more than half of the Acceptors sends the numbered proposal, together with its content, to more than half of the Acceptors.

    Of Sun Gan, Mi Zhu and Fa Zheng, only Sun Gan received replies from more than half of the generals, so he sent his message out again to Liu, Guan and Zhang. Unfortunately, only Guan Yu received it this time.

    Decision Phase 2: an Acceptor accepts a decision proposal only if its number is the highest one it has previously responded to. It then informs all the learners.

    Guan Yu received the proposal he had responded to earlier: Sun Gan’s message to attack Cao Cao. He accepted it and informed all the learners (the soldiers). At this point all nodes had reached agreement on the attack and happily marched on Cao Cao.
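Here is a small Python sketch of how an acceptor such as Guan Yu might behave across both stages. The `Acceptor` class and its method names are hypothetical, and the accept rule below (reject anything numbered lower than what was already responded to) is my reading of the simplified description above. The stale proposal 555 and its content are invented purely to show a rejection.

```python
class Acceptor:
    def __init__(self, name):
        self.name = name
        self.promised = None   # highest proposal number responded to so far
        self.accepted = None   # (number, value) once a proposal is accepted

    def on_prepare(self, number):
        """Prepare stage: respond only to a number higher than any seen."""
        if self.promised is None or number > self.promised:
            self.promised = number
            return True
        return False

    def on_accept(self, number, value):
        """Accept stage: refuse proposals numbered below the promise."""
        if self.promised is not None and number < self.promised:
            return False
        self.promised = number
        self.accepted = (number, value)
        return True


guan_yu = Acceptor("Guan Yu")
guan_yu.on_prepare(111)   # Fa Zheng's letter
guan_yu.on_prepare(666)   # Sun Gan's letter, the highest one he holds

learned = []              # what the learners (soldiers) are told

# A hypothetical late proposal with a lower number is refused.
stale_ok = guan_yu.on_accept(555, "attack Zhou Yu")

# Sun Gan's proposal matches the highest number, so it is accepted.
if guan_yu.on_accept(666, "attack Cao Cao"):
    learned.append(guan_yu.accepted)

print(stale_ok, learned)
```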

Some people will ask: why is it so complicated? Yes, Paxos is notoriously complex, and this is already a greatly simplified version. The Liu Bei example is a Chinese adaptation of the “Byzantine Generals Problem” setting, although here messengers can only be lost, not lie.

Even so, Paxos was, and still is, one of the most widely used distributed consistency solutions. It solves a very hard problem: how to make all nodes in a cluster reach a consensus when communication is unreliable (machines hanging, messages delayed, lost, duplicated, reordered, and so on). Built on the basic logic above plus a number of additional rules, Paxos lets the whole distributed system agree on a piece of information despite these kinds of failures within the cluster.
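One reason the “more than half” rule matters is that any two majorities of the same cluster share at least one node, so two conflicting decisions cannot each gather their own separate majority. The brute-force check below is just an illustration of that counting fact; `majorities` and `all_majorities_overlap` are names I made up.

```python
from itertools import combinations


def majorities(nodes):
    """Every subset of nodes that contains more than half of them."""
    need = len(nodes) // 2 + 1
    return [
        set(group)
        for size in range(need, len(nodes) + 1)
        for group in combinations(nodes, size)
    ]


def all_majorities_overlap(nodes):
    """True if every pair of majorities has at least one node in common."""
    groups = majorities(nodes)
    return all(a & b for a in groups for b in groups)


generals = ["Liu Bei", "Guan Yu", "Zhang Fei"]
print(len(majorities(generals)), all_majorities_overlap(generals))
```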

But Paxos is laborious and hard to understand, and it involves many nodes and a lot of overhead.

Is there a better solution? Yes, and I will cover the ZAB protocol next time.
