This is the 24th day of my participation in Gwen Challenge.

Hello, I am Wukong.

Star: github.com/Jackson0714… Personal website: www.passjava.cn

Hello, I am Wukong.

Main contents of this paper:

In the last three months, I updated 8 articles on distributed theory and algorithm. I found that although these knowledge points are a little boring, they are very important. Therefore, I explained each article in the way of stories, and tried to make everyone understand each article.

In the process of writing, I slowly understand the essence of distributed theory and algorithm, listen to the breakdown below.

The importance of

Where is the importance of distributed theory and algorithms?

How often have you heard that CA, AP, CAP triangles can’t be achieved at the same time?

How does Zookeeper elect its Leader?

How do you do fault tolerance?

The answers to these questions are the essence of distributed theory and algorithms. And let’s say a very practical question, at this stage, the knowledge of distributed algorithms is also going to interview architects, technical experts and other high-paying jobs, is often asked. But only a minority of job seekers understand this.

Why do I emphasize the importance of distributed algorithms? Whether it’s the pursuit of technical depth, or the need for long-term career development, it can improve your core competencies.

Studious?

Nowadays, many developers have some experience in how to use distributed components and know the general meaning of CAP theory and BASE theory. But very few people really look at distributed algorithms, for three reasons:

  • Worried that the algorithm is too complex, so it takes very little time.

  • There are fewer resources on the web that can articulate distributed algorithms in plain English.

  • There is no clear path to learning distributed algorithms.

The way to learn distributed protocols and algorithms is to learn four basic theories as a foundation. Then you learn about distributed protocols and algorithms. It is like building a house on a foundation. When the foundation is well laid, a more stable high-rise can be built.

The distributed theory has four major parts:

Four basic theories

  • Question of the Byzantine general

  • Theory of CAP

  • Theory of ACID

  • The BASE theory of

There are eight types of distributed protocols and algorithms:

Eight distributed protocols and algorithms

  • Paxos algorithm

  • Raft algorithm

  • Consistent Hash algorithm

  • Gossip protocol algorithm

  • Quorum NWR algorithm

  • FBFT algorithm

  • POW algorithm

  • ZAB agreement

How to learn and master efficiently?

The key to develop distributed system is to choose the appropriate algorithm according to the characteristics of the scene and compromise between consistency and availability. How to do a good compromise depends on whether you really understand the characteristics of each algorithm.

To be honest, if you study these theories and algorithms carefully, and understand the characteristics of each algorithm, which is suitable for the scenario, when developing a distributed system, do know yourself and know your opponent, then you can start to win, and the problems in the actual scene can be analyzed clearly and solved.

So what are the characteristics and applicable scenarios of these algorithms, and what aspects should be considered?

Four dimensions of distributed algorithms

Four dimensions: Byzantine fault tolerance, consistency, performance, availability.

Here I made a table of four dimensions of distributed algorithm, you can compare:

Byzantine fault tolerance

Byzantine fault tolerance is a model proposed in The Problem of the Byzantine General, which describes a completely implausible scenario. Untrustworthiness is reflected in:

  • Failure behavior. Let’s say a node fails.

  • Malicious behavior. For example, malicious nodes pose as normal nodes and issue incorrect instructions.

The other side of Byzantine fault tolerance is non-Byzantine fault tolerance, also known as fault tolerance, which solves the problem of distributed system failure but no malicious node consensus, such as node hardware failure, node service process crash, etc.

Non – Byzantine fault – tolerant algorithm

In a trusted environment, only fault tolerance is required, such as 2PC, TCC, Paxos algorithm, Raft algorithm, Gossip protocol, Ouorum NWR algorithm, ZAB protocol.

Byzantine fault tolerance algorithm

But in the untrusted environment, need to have the Byzantine fault tolerance ability, error POW algorithm, FBFT algorithm.

consistency

There are three types of consistency:

  • Strong consistency: After the write operation is complete, any subsequent access can read the updated value.

  • Weak consistency: After the write operation is complete, the system cannot ensure that subsequent accesses can read the updated value.

  • Final consistency: Guarantees that if there are no new writes to an object, eventually all subsequent accesses will read the same recently updated value.

At the database operation level, we mostly use the two-stage commit protocol (2PC) to ensure strong consistency. In distributed systems, Raft algorithms are often used to ensure strong consistency. If availability is considered, the Gossip protocol is used for final consistency, along with the Quorum NWR algorithm’s three parameters to regulate fault tolerance. Zookeeper uses the ZAB protocol to provide final consistency based on read performance.

availability

Availability means that response data is available, but the data is not guaranteed to be up to date. Service availability is emphasized. Prerequisites: Access a non-faulty node.

The most usable protocol is the Gossip protocol, even if there is only one node, the cluster can provide services. Then there is the Paxos/Raft/ZAB/Quorum NWR/FBFT/POW algorithm that can tolerate partial node failures.

However, 2PC and TCC require that all nodes run properly.

performance

Performance and availability are closely related, with higher availability leading to better performance.

The availability ranking above also applies to the performance dimension. The Gossip protocol can be used in AP distributed systems with strong horizontal expansion capability and the highest read and write performance.

Paxos/Raft/ZAB/Quorum NWR/FBFT/POW algorithms are all leader models, write performance depends on the leader, and read performance depends on the implementation of consistency. Performance is middling.

However, when 2PC and TCC implement transactions, resources need to be reserved and locked, resulting in poor performance.

Learning path

Lecture 1: The Problem of the Byzantine generals

Must first understand the Byzantine general problem, this article I use the killing of The Three Kingdoms card game in the four identity cards to explain the Byzantine general problem.

Using the Killing of The Three Kingdoms to talk about distributed algorithms, comfortable?

Lecture 2: CAP, BASE, ACID theory

It is an explanation of the three major theories of CAP, BASE and ACID. In this paper, I use the strength and softness in Taijiquan to compare ACID and BASE, and how to balance the strength and softness needs CAP theory.

Taijiquan distributed theory, really comfortable!

Lecture 3: Paxos algorithm

In order to better understand the Paxos algorithm, I used Zhuge Liang and Pang Tong in the Romance of The Three Kingdoms as the proposers to analyze the details of the Paxos algorithm.

Zhuge Liang VS Pang Tong, win distributed Paxos

Lecture 4: Raft Algorithm

Raft is actually pretty easy to understand, but it would be confusing to describe it directly, so I used giFs to simulate the Raft election process and it was easy to understand.

Using GIFs to Explain Distributed Raft

Lecture 5: Consistency hashing

This is one of the distributed algorithms commonly used in load balancing, routing and addressing. This algorithm is not difficult to understand, but rather boring, so I use the story of Han Xin’s soldiers to explain, humorous and interesting.

“Han Xin Big Trick: Consistency Hashing”

Lesson 6: The Gossip protocol

The English word Gossip is Gossip, infectious, so I use an infectious virus story to explain, can not only learn distributed algorithms can understand virus knowledge, kill two birds with one stone.

Virus Intrusion: All by distributed Gossip Protocol

Lecture 7: Quorum NWR

The three parameters of N, W and R are quite obscure. In order to make it easier for everyone to understand, I use Taishanglaojun’s refining furnace as the node, the drugs in the furnace as the data, and the manufacturing process to reflect the three parameters of NWR, which are more visualizing.

Distributed Quorum NWR for Tai Shang Lao Jun’s Alchemy

Lecture 8: POW algorithm

When LEARNING POW algorithm, I was involved in the knowledge of blockchain, so I went to read a book on blockchain, Blockchain: From Digital Currency to Credit Society, a popular science book, which was very helpful for me to understand blockchain and bitcoin. One of the core pieces of knowledge used in blockchain is the POW algorithm, also known as proof of work. I explained block chain, bitcoin and proof of work with the story of Zixia Fairy and Supreme Treasure, which was humorous and interesting.

Zixia Fairy: Can we withstand the twelve Questions of blockchain?

By the way, THE FBFT and ZAB protocols have not been explained to you, and it may take some time to make up for it. Because many readers are urging me to update my open source project PassJava, I have to change the open source project.

In addition, I organized these eight articles into a PDF document with a total of 2W+ words, available on Github at github.com/Jackson0714

About the author: Wu Kong, 8 years of experience in Internet development and architecture, explains distributed, architecture design, Java core technology with stories. “JVM performance optimization practice” column author, open source “Spring Cloud practice PassJava” project, independently developed a PMP brush small program.

I am Wukong, strive to become stronger, become super Saiya people!