1. Distributed development process

1.1 Single point centralized

** Features: **App, DB, FileServer are all deployed on one machine. And less access requests

1.2 Application Service and Data Service Separation

** Features: **App, DB, FileServer are deployed on an independent server. And less access requests

1.3 Using caching to improve performance

** Features: ** Frequently accessed data in the database is stored in the cache server, reducing the number of database access, reducing the database pressure

1.4 Application Server Cluster

** Features: ** Multiple application servers provide external services simultaneously through load balancing, which solves the problem of upper processing capacity of a single server

1.5 Database read/write Separation

** Features: ** Database for read and write separation (master and slave) design, to solve the database processing pressure

! [](data:image/svg+xml; utf8,)

1.6 Reverse proxy and CDN acceleration

** Features: ** Uses reverse proxy and CDN to speed up system access

1.7 Distributed File Systems and Distributed Databases

** Features: ** Database uses distributed database, file system uses distributed file system

With the development of services, read and write separation of databases cannot meet the requirements. Distributed databases and distributed file systems are required to support the separation

Distributed database is the last method after database splitting. It is only used when the size of a single table is very large. The more commonly used method of database splitting is service splitting, deploying databases of different services on different machines

Second, distributed technology details

1. The concurrency

2. Distribution

Large tasks are divided into multiple tasks and deployed on multiple machines to provide external services

3. Lack of a global clock

Time should be consistent

4. The equivalence

A service deployed on multiple machines is the same without any difference

5. Breakdowns are bound to happen

The hard disk is down and the CPU is burned….

Distributed transactions

1. ACID

** Atomicity: ** All operations in a transaction either complete or not complete, and do not end at some intermediate stage. If a transaction fails during execution, it will be rolled back to the state before the transaction began, as if the transaction had never been executed. ** Consistency: ** The integrity of the database is not compromised before and after a transaction. This means that the data written must conform to all the preset rules, including the accuracy of the data, the concatenation of the data, and the ability of the subsequent database to do its predetermined work spontaneously.

For example, A has 500 yuan, B has 300 yuan, A transfers 100 yuan to B, no matter what, the sum of A and B is always 800 yuan ** Isolation: ** The ability of a database to allow multiple concurrent transactions to read, write and modify its data at the same time. Isolation prevents data inconsistencies due to cross-execution when multiple transactions are executed concurrently. Transaction isolation can be divided into different levels, including Read uncommitted, Read Committed, Repeatable Read, and Serializable. ** Durability: ** Changes to data are permanent after transactions, they are not lost even if system failures.

2. 2P/3P

2P= Two Phase commit (RDBMS (relational database management system) is often this mechanism, to ensure strong consistency)

3P= Three Phase commit

**2P/3P is to ensure the ACID of the transaction (atomicity, consistency, isolation, persistence).

2.1 Two stages of 2P

Phase 1: Commit transaction request (vote phase) asks if the transaction can be committed

Phase 2: Perform transaction commit (COMMIT, rollback) the actual commit transaction

2.2 Three stages of 3P

Stage 1: Commit or not – Ask if transaction commit can be done Stage 2: Pre-commit – Pre-commit transaction Stage 3: Perform transaction commit (COMMIT, rollback

Explanation: 3P splits stage 1 of 2P into the first two stages

3. The theory of the CAP

Consistency: Data in a distributed database is consistent

Availability: When a node fails, other nodes can continue to provide services

A new machine can be added and backup data can be synchronized from another normal machine to another one if the machine accommodating the database fails, e.g. the hard disk fails and data is lost

The characteristics of CAP theory: CAP can only satisfy two of them

CA(discard P) : Puts all data on one node. Ensure consistency and availability. AP(give up C) : Give up strong consistency and guarantee final consistency. CP(abandon A) : Once the system encounters A fault, the affected server needs to wait for A period of time and cannot provide services during the recovery period.

To illustrate CAP theory:

Machine1-db1-tbl_person and TBL_ORDER machine2-DB2-tBL_person and TBL_ORDER Alter table tBL_person, tBL_order, machine2, tBL_order, machine2, tBL_order This is consistency 2) when one of the machines goes down, you can continue to provide services, and when you restart the machine, you can continue to provide services, and this is availability 3) When machine1 goes down, all the data is lost, there’s no problem, because there’s still data on Machine2 and Machine3, Add another machine, Machine4, and synchronize the backup data from one machine, Machine2 and Machine3. This is partition fault tolerance

4. The BASE theory

** Basic availability: ** Failure of a distributed system that allows a partial loss of availability (service degradation, page degradation) ** Soft state: ** Allows distributed systems to have intermediate states. And the intermediate state does not affect the availability of the system. The intermediate state here refers to the delayed final consistency of data updates between different data replication. For example, in CAP theory, when inserting numerical data into tables TBL_person and TBL_ORDER of Machine1’s DB1, At the same time, the inserted data should be synchronized to Machine2 and Machine3. When the network of Machine3 has a problem, the synchronization fails, but after a while the network recovers, the synchronization succeeds. This synchronization failure state is called soft state, because the synchronization succeeds eventually. ** Final consistency: ** Data Replications achieve consistency over time.

5. Paxos algorithm

5.1 Before introducing the Paxos algorithm, let’s take a look at a short story

Question of the Byzantine general

Byzantium was the Eastern Roman Empire from the 5th century to the 15th century. Byzantium is now Istanbul, Turkey. As we can imagine, the Byzantine army had many branches, stationed outside the enemy’s cities, and each branch was commanded by its own general. Suppose there are 11 generals, and the generals can only communicate with their correspondents. After observing the enemy, loyal generals must draw up a unified plan of action — attack or retreat. However, there were traitors among the generals who did not want the loyal generals to reach an agreement that would affect the formulation and dissemination of a unified plan of action. The problem was this: the generals had to have an agreement that all the loyal generals could agree on, and a few traitors could not make the loyal generals make the wrong plan-make some generals attack and others retreat. If there were nine loyal generals, five who judged to attack, four who judged to retreat, and two spies who judged to retreat maliciously, this would be perfectly permissible, even if it turned out to be a false retreat. Because these 11 generals are still consistent.

! [](data:image/svg+xml; utf8,)

Summary: 1) 11 generals attack the city 2) attack at the same time (motion, resolution), retreat at the same time (motion, resolution) 3) Whether to retreat or attack, only half of the generals agree to carry out 4) there are traitors among the generals, will interfere with the formation of the resolution

5.2 The Paxos algorithm is introduced below

Mike Burrows, author of Google Chubby, has said that there is only one consistency algorithm in the world and that is Paxos, and the rest of them are inferior.

Paxos: Majority resolution (final resolution of consistency)

The Paxos algorithm has three roles: Proposer, Acceptor, and Learner

**Proposer: **Proposer

Submit a motion (judge whether it is more than half), submit a motion for approval (judge whether it is more than half)

**Acceptor: **Acceptor

(a) accept a proposal or reject a proposal with a proposer

**Learner: **Learner

If the motion arises, study the motion.

Setting 1: If an Acceptor does not accept the proposal, it must accept the first proposal

Setting 2: Each motion must have a number, and the number can only be increased, not repeated. It gets bigger and bigger.

Setting 3: accept a proposal with a larger number. If the number is smaller than the number previously accepted, reject the proposal

Setting 4: There are 2 types of bills (bills submitted, bills approved)

! [](data:image/svg+xml; utf8,)

1) Prepare stage (proposal submission)

A) Proposer seeks proposals V. First issue a Prepare request to most acceptors. The Prepare request contains the sequence number K

B) After receiving the Prepare request number K, check whether the Acceptor has processed the Prepare request before.

C) If an Acceptor has not accepted any Prepare requests, it responds to the proposals with OK, indicating that an Acceptor must accept the first proposal it receives.

D) Otherwise, if an Acceptor has previously accepted any Prepare request (such as MaxN), it compares the proposal number and responds to the Proposer with reject or error if K<MaxN

E) If K>=MaxN, then check if there were any previously approved proposals and if not, respond to the Proposer with OK and record K

F) if K>=MaxN, then check if there is any approved proposal before, if so, reply to the approved proposal number and content (e.g. <AcceptN, AcceptV>, AcceptN indicates the approved proposal number, AcceptV indicates the approved proposal content)

2) Accept (approval)

A) Proposer receives an OK response from more than half of its acceptors without any proposal number or proposal content it has approved. A Proposer then continues to submit an approval request with a proposal number K and a proposal content V (<K, V>)

B) Proposer receives an OK response from more than half of its acceptors with an approved proposal number and proposal content (< POK, proposal number, proposal content >). A Proposer then sends more than half of the responses (suppose < POK, AcceptNx, AcceptVx>) to acceptors as a request to submit an approval (request <K, AcceptVx>).

C) if no response is received from more than a majority of Acceptors, the proposal number K is numbered K+1 and the number is sent to the Acceptors again.

D) If an Acceptor receives a Proposer with a numbered K<MaxN, it does not respond or reject it.

E) If an Acceptor receives an Accept request from a Proposer numbered K>=MaxN, it approves the proposal and sets the number of proposals it approves to <K, Accept the proposals numbered numbered, Accept the proposals numbered numbered numbered, and responds to the proposals numbered numbered K.

F) If the Proposer receives a majority of Accept responses over a period of time, it terminates the process and informs the Leaner that it can study the proposals.

G) If no more than a majority of proposals are received after a period of time, the Proposer changes the number of proposals and resumes the Prepare phase.

5.3 Paxos sample

Example 1: Sequential proposal scenario

! [](data:image/svg+xml; utf8,)

Role:

Proposer: Staff 1, staff 2

Acceptor: General 1, General 2, General 3 (Decision maker)

1) Staff officer 1 initiates a proposal to send a message to the three generals (No. 1); 2) The three generals received advice from staff 1 to save (No. 1) so as not to forget; At the same time, send the signalman back with a message (OK); 3) After receiving the reply from at least two generals, staff 1 will send another signal to the three generals with a message (no. 1, attack time 1). 4) Three generals received the time of staff 1 and saved it (number 1, attack time 1) to avoid forgetting; At the same time, the signalman was sent back with the message “Accepted”. 5) Staff 1 receives the Accepted content from at least 2 generals, confirming that the attack time has been Accepted by all; 6) Staff officer 2 makes a proposal to send a message to the three generals (No. 2); 7) The three generals received a proposal from staff Officer 2, and kept (No. 2) because (No. 2) was larger than (No. 1) to avoid forgetting; Since he had already accepted the proposal of staff 1, he sent a message to the signalman (number 1, attack time 1). 8) Staff 2 receives replies from at least two generals. As the replies contain the accepted proposals of Staff 1, Staff 2 does not propose a new attack time and accepts the time proposed by Staff 1;

Example 2: Crossover scenario

! [](data:image/svg+xml; utf8,)

Role:

Proposer: Staff 1, staff 2

Acceptor: General 1, General 2, General 3 (Decision maker)

1) Staff officer 1 initiates a proposal to send a message to the three generals (No. 1);

2) The situation of the three generals is as follows: a) General 1 and General 2 receive the proposal from Staff 1, and General 1 and General 2 record it (Number 1). If other staff propose a smaller number, it will be rejected; At the same time, send the signalman back with a message (OK); B) The signal soldier in charge of informing General 3 was captured, so General 3 did not receive the proposal from Staff 1;

3) Staff Officer 2 also made a proposal at the same time to send a message to the three generals (No. 2). 4) The situation of the three generals is as follows: a) General 2 and General 3 receive the proposal from Staff 2, and general 2 and General 3 record (Number 2). If any other staff proposes a smaller number, it will be rejected; At the same time, send the signalman back with a message (OK); B) The signal soldier in charge of informing General 1 was captured, so General 1 did not receive the proposal from General 2; 5) After receiving the replies from at least two generals, staff 1 will send another signal to the two generals with a message (no. 1, attack time 1). 6) The situation of the two generals is as follows: a) General 1 received (No. 1, attack time 1) and saved (No. 1, attack time 1). At the same time, the signalman was sent back with the message “Accepted”. B) General 2 received (Rejected, Attack time 1), and sent a message to the signalman (Rejected, Ref. 2) because (Rejected, ref. 1) is less than (Rejected, Ref. 2); 7) After receiving the replies from at least two generals, Staff Officer 2 will send another message to the two generals with the contents (No. 2, attack time 2). 8) General 2 and General 3 received (Number 2, attack time 2), which was the same as the number they kept, so they saved (Number 2, attack time 2) and sent the signal back with a message saying “Accepted”; 9) Staff 2 receives the Accepted content from at least 2 generals confirming that the attack time has been Accepted by the majority;

Staff 1 Accepted (Accepted) and Rejected (Rejected); Staff Officer 1 renewed the proposal and sent the signalman with a message to the three generals (no. 3);

11) The situation of the three generals is as follows: a) General 1 received a proposal from Staff 1, and as (No. 3) was larger than the one previously saved (No. 1), he kept (No. 3); Since General 1 had accepted the previous proposal of General 1, he sent the signal soldier back with a message (Number 1, attack time 1). B) General 2 received a proposal from Staff Officer 1 to save (No. 3) as it was larger than that previously saved (No. 2); Since General 2 had accepted the offer of Staff 2, the signalman was sent back with a message (Number 2, attack time 2); C) The signalman responsible for informing General 3 is captured, so General 3 does not receive the proposal from Staff 1;

12) Staff 1 received at least two replies from generals, compared the numbers of the two replies, and selected the attack time corresponding to the large number as the latest proposal; Staff Officer 1 again sent the signalman to the two generals who had replied with a message (no. 3, attack time 2); 13) General 1 and General 2 received (Number 3, attack time 2), which was the same as the number they kept, so they saved (Number 3, attack time 2), and sent the signalman back with the message saying “Accepted”; 14) Staff 1 received the accepted content from at least 2 generals confirming that the attack time had been accepted by the majority.

Zookeeper ZAB protocol

Zookeeper Automic Broadcast(ZAB) is a classic implementation of Paxos

Terms:

Quorum: a collection of more than half of a cluster

1. ZAB(ZooKeeper) nodes have four states

** Looking: ** Elects the Leader state (in crash recovery state)

**following: the state of ** followers who obey the command of the Leader

**leading: ** The current node is the Leader, responsible for coordination.

**observing: **observer, a read-only node that does not participate in the election.

2. Two modes in ZAB (how ZK conducts elections)

Crash recovery, message broadcast

1) Crash recovery

The leader hangs, and a new leader needs to be elected

A. Each server has a vote

, such as (3,9), which votes for itself. B. After each server votes for itself, the server votes for other available servers. If (3,9) of Server3 is assigned to Server4 and Server5, the analogy is c. Zxid = myID = Zxid = myID = Zxid When comparing the Zxid, the leader is the leader. Change the server state (crash recovery -> data synchronization, or crash recovery -> message broadcast)
,>

Supplementary notes on related concepts:

Epoch cycle value

AcceptedEpoch: the follower has accepted the newepoch proposal from the leader to change the age number.

CurrentEpoch (metaphor: current year) : Current year

LastZxid: the latest proposal zxID received in history (maximum value)

History: log of the current node receiving transaction proposals

Zxid data structure:

cZxid = 0x10000001b

64-bit data structures

High 32 bits: 10000

The combination of the Leader cycle number + myID

Lower 32 bits: 001B

The incrementing sequence (monotonically increasing sequence) of a transaction is +1 whenever the client requests it

When a new Leader is created, the maximum transaction Zxid in the local log is fetched from the Leader server, the epoch+1 is read from it as a new epoch, and the lower 32 is 0 (to ensure the id is definitely self-incrementing).

2) Message broadcast (similar to 2P commit)

A. Lenader accepts the request and assigns the request to the globally unique 64 bit incremented Id (zxID). B. Send the ZXID as a proposal to all followers. C. After receiving the motion, all the followers want to write the motion to the hard drive and send the Leader an ACK (OK) immediately. D. When the Leader receives more than half of the valid Acks, the Leader sends the commit command to all followers. E. follower Run the commit command. Note: At this stage, the ZK cluster is officially available and the Leader can broadcast messages. If a new node joins, synchronization is required.

3) Data synchronization

A. Obtain the Leader’s maximum lastZxid (from the local log) b. Find the data corresponding to the zxID and synchronize it (the data synchronization process ensures that all followers are the same) c. Only when quorum synchronization is complete can a quasi – Leader become a true Leader