Distributed theory

Distributed system: a system whose hardware or software components are spread across networked computers and communicate and coordinate with each other by passing messages

1. Problems solved (shortcomings of a monolithic architecture)

  1. Limited processing capacity when serving massive numbers of users
  2. The more complex the program, the less efficient the development
  3. A serious bug in production stops every service
  4. As the amount of code grows, compilation efficiency drops
  5. Locked into a single technology stack

2. Explanation of terms

  1. Distributed – multiple people doing [different] things together

  2. Cluster – a group of people doing the same thing together

  3. Network partition (split brain) – network failures split a distributed system into small clusters; the network between the small clusters is broken while the network inside each small cluster is normal

3. Architecture evolution

  1. Monolithic application architecture
  2. Application server and database server are separated
  3. Application server cluster
  4. Database read/write separation
  5. Add a search engine to relieve database read pressure
  6. Add a cache to relieve database read pressure
  7. Database is split horizontally/vertically
  8. Application servers are split vertically
  9. Application servers are split horizontally

4. Consistency models

  1. Strong consistency – whatever is written must be immediately readable by the system – impacts performance
  2. Weak consistency – no promise of how soon data becomes consistent, but it tries to reach consistency at some granularity (seconds, minutes, hours)
  3. Read-your-writes consistency – a user immediately sees the content they updated themselves; no guarantee for other users
    • Read that specific content from the master – puts the master under pressure
    • Read newly updated content from the master and, after some time, from the slave
  4. Eventual consistency – only guarantees that all replicas in the system eventually hold the correct data

5. CAP theorem

  1. Consistency – all replicas are consistent; data read from any node is the latest
  2. Availability – the externally provided service stays normal; no response timeouts or errors occur
  3. Partition tolerance – the system can still provide service when a network partition occurs

At most two of the three requirements can be met at the same time

Proof: a user sends a request to N1 to change the value from V0 to V1; at that moment the network between N1 and N2 is interrupted, while another user sends a request to N2 to read the value. There are three options:

(1) Return V0 [AP mode, sacrificing consistency]

(2) Wait for the network to recover, then return V1 [CP mode, sacrificing availability]

(3) Merge N1 and N2 into one node [CA mode, abandoning distribution]

6. BASE theory

BASE is the result of trading off the CAP theorem: if strong consistency cannot be achieved, eventual consistency should be reached in a way appropriate to the business characteristics

  1. Basically Available – allows a partial loss of availability when the distributed system fails
    • Time: normally responds within 0.5 seconds; under failure this rises to 1~2 seconds
    • Function: when traffic surges, some users are directed to a degraded page
  2. Soft state – data is allowed to exist in intermediate states (some replicas not yet synchronized) without affecting overall system availability
  3. Eventually consistent – data becomes consistent after a period of synchronization

7. Consistency protocols (handling distributed database transactions)

1. 2PC (two-phase commit)

Process

  1. Prepare phase – the coordinator sends a Prepare message to each participant; each participant runs its local transaction but does not commit
  2. Commit phase – if any participant failed or timed out, the coordinator sends a Rollback message to the participants; otherwise it sends a Commit message
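
The coordinator's decision rule fits in a few lines. Below is a minimal Java sketch; the Participant interface and its methods are illustrative stand-ins for real resource managers, and a production coordinator would also log its decision durably before phase 2.

```java
import java.util.List;

// Hypothetical participant interface: runs the local transaction on
// prepare() without committing, then commits or rolls back on command.
interface Participant {
    boolean prepare();  // true = ready to commit
    void commit();
    void rollback();
}

class TwoPhaseCommitCoordinator {
    // Returns true if the global transaction committed.
    boolean execute(List<Participant> participants) {
        // Phase 1: send Prepare to every participant.
        boolean allPrepared = true;
        for (Participant p : participants) {
            try {
                if (!p.prepare()) { allPrepared = false; break; }
            } catch (Exception timeoutOrFailure) {
                allPrepared = false; break;    // failure or timeout counts as "no"
            }
        }
        // Phase 2: Commit only if everyone prepared, otherwise Rollback.
        for (Participant p : participants) {
            if (allPrepared) p.commit(); else p.rollback();
        }
        return allPrepared;
    }
}
```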

Disadvantages

  1. Synchronous blocking – participants remain blocked from the end of phase 1 until phase 2 arrives
  2. Single point of failure – if the coordinator crashes before running phase 2, the participants' transactions stay locked
  3. Inconsistent data – if the coordinator crashes partway through sending Commit messages, only some participants commit, leaving the data inconsistent
  4. Too conservative – the failure of any single node fails the entire transaction

2. 3PC (three-phase commit)

Process

  1. CanCommit – the coordinator sends each participant a request containing the transaction and asks whether it can be run
  2. PreCommit – the coordinator asks the participants to run the transaction (still without committing)
  3. DoCommit – the coordinator asks the participants to commit the transaction; if a participant receives no message from the coordinator at this stage, it commits by default after a timeout

3PC reduces 2PC's blocking scope but still does not completely solve the data-inconsistency problem

8. Consensus algorithms (choosing the final value or a Leader)

1. The Paxos algorithm

Roles

  1. Client – sends requests to the distributed system
  2. Proposer – proposes values and negotiates with the Acceptors to reach agreement
  3. Acceptor – decision maker; approves proposals
  4. Learner – learns the final decision

Constraints

  1. An Acceptor must accept the first proposal it receives
  2. Once a proposal is chosen, any later-chosen proposal must carry the same value
  3. A proposal is chosen once it has been accepted by more than half of the Acceptors

Process 1

  1. A Proposer sends a prepare request numbered N, carrying no value, to more than half of the Acceptors
  2. An Acceptor that has not yet accepted any proposal returns null
  3. The Proposer then sends an accept request numbered N carrying its own value
  4. The Acceptors accept the proposal with number N and that value

Process 2

  1. Another Proposer sends a prepare request numbered N+1, carrying no value, to more than half of the Acceptors
  2. An Acceptor that has already accepted proposal N returns the value of proposal N
  3. The Proposer then sends an accept request numbered N+1 carrying that returned value
  4. The Acceptors accept the proposal with number N+1 and that value

Extreme case: two Proposers keep submitting proposals with successively higher numbers, so no value is ever chosen (an endless loop, i.e. livelock)

Solution: stipulate that only a master Proposer may submit proposals
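
To make the two processes concrete, here is a minimal Java sketch of the Acceptor side. All names are illustrative, and a real Acceptor must persist this state to stable storage before replying.

```java
class Acceptor {

    // Reply to a prepare request: whether the promise was made, plus the
    // previously accepted value (null if none, as in step 2 of Process 1).
    static class Promise {
        final boolean promised;
        final String acceptedValue;
        Promise(boolean promised, String acceptedValue) {
            this.promised = promised;
            this.acceptedValue = acceptedValue;
        }
    }

    private long promisedN = -1;   // highest prepare number promised so far
    private String acceptedValue;  // value of the last accepted proposal

    // Prepare request numbered n (carries no value).
    synchronized Promise prepare(long n) {
        if (n <= promisedN) {
            return new Promise(false, null);  // already promised a higher number
        }
        promisedN = n;                        // promise to ignore proposals below n
        return new Promise(true, acceptedValue);
    }

    // Accept request numbered n with a value: accept it unless a
    // higher-numbered prepare has been promised in the meantime.
    synchronized boolean accept(long n, String value) {
        if (n >= promisedN) {
            promisedN = n;
            acceptedValue = value;
            return true;
        }
        return false;
    }
}
```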

2. The Raft algorithm

Roles

  1. Leader – interacts with the client; only one exists
  2. Candidate – nominates itself during the election and becomes the Leader if the election succeeds
  3. Follower – voter; waits to be notified to vote

Process

  1. When the election begins, all nodes are Followers
  2. A node stays a Follower as long as it receives RequestVote (vote for me) or AppendEntries (a Leader has been elected) requests
  3. If no request arrives within a period of time (randomly 150-300 ms), the Follower becomes a Candidate and runs for Leader; if it receives more than half of the votes, it becomes the Leader
  4. If no Leader is elected in the end, Term + 1 and the next round of election starts
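
A minimal Java sketch of the election timeout in step 3. The class and method names are illustrative; a real node also handles per-term voting and log replication.

```java
import java.util.concurrent.ThreadLocalRandom;

enum Role { FOLLOWER, CANDIDATE, LEADER }

class RaftNode {
    Role role = Role.FOLLOWER;
    long term = 0;
    long lastHeartbeat = System.currentTimeMillis();
    // Each node picks a random timeout in [150, 300) ms, so followers
    // rarely time out at the same moment and split the vote.
    final long electionTimeoutMs = ThreadLocalRandom.current().nextLong(150, 300);

    // Called when RequestVote or AppendEntries arrives: stay a Follower.
    void onLeaderMessage() {
        lastHeartbeat = System.currentTimeMillis();
    }

    // Called periodically by a timer thread.
    void tick(int votesReceived, int clusterSize) {
        if (role == Role.FOLLOWER
                && System.currentTimeMillis() - lastHeartbeat > electionTimeoutMs) {
            role = Role.CANDIDATE;   // no request within the timeout: run for Leader
            term++;                  // Term + 1 starts the next round of election
        }
        if (role == Role.CANDIDATE && votesReceived > clusterSize / 2) {
            role = Role.LEADER;      // more than half of the votes
        }
    }
}
```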

9. Idempotence

Running an operation multiple times has the same effect as running it only once

1. Cases that require idempotence

  • A user submits a form repeatedly

  • A user casts malicious repeated votes

  • An interface call is retried after a timeout

  • A message is consumed more than once

    Unknown system problems caused by retries must be avoided

2. Solutions

  • Unique database field (see the JDBC sketch after this list)

    • Applicable operations
      • add
      • delete
    • Constraint
      • the data table must contain a unique field
    • Process
      1. The upstream service generates the (unique) ID and passes it in when invoking the downstream service
      2. The downstream service's insert fails with an error if the data already exists
  • Optimistic database locking (see the JDBC sketch after this list)

    • Applicable operations
      • update
    • Constraint
      • an extra version field must be added to the data table
    • Process
      1. The upstream service passes the record's current version value to the downstream service

      2. Before modifying the record, the downstream service checks that the version matches; on a match it updates the record and increments the version by 1

  • Token (see the Redis sketch after this list)

    • Applicable operations
      • add
      • update
      • delete
    • Constraint
      • Redis is used for verification
    • Applicable scenario
      • a user submits an order
    • Process
      1. When the database record is added successfully, the user's token is used as a Redis key and a record with a lifetime (10 seconds) is inserted into Redis
      2. Before adding data to the database, look the token up in Redis; if the record exists, this is a repeated addition
  • Unique serial number (see the Redis sketch after this list)

    • Applicable operations
      • add
      • update
      • delete
    • Constraint
      • Redis is used for verification
    • Applicable scenario
      • services invoked indirectly
    • Process
      1. The upstream service generates a unique serial number, stores it in Redis as a key, and passes it to the downstream service
      2. Before adding data, the downstream service checks whether the serial number exists in Redis; if it exists, this is the first call, and the Redis record is deleted after the data is added
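
A minimal JDBC sketch of the two database-based solutions above. The orders table, its columns, and the method names are assumptions for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

class DbIdempotency {

    // Unique field: the upstream-generated id has a UNIQUE constraint,
    // so a retried insert fails instead of creating a second row.
    static boolean insertOnce(Connection conn, long id, String payload) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO orders (id, payload) VALUES (?, ?)")) {
            ps.setLong(1, id);
            ps.setString(2, payload);
            ps.executeUpdate();
            return true;
        } catch (SQLIntegrityConstraintViolationException duplicate) {
            return false;  // the row already exists: this is a repeated call
        }
    }

    // Optimistic locking: the update succeeds only if the version still
    // matches; a retry carrying the old version updates zero rows.
    static boolean updateOnce(Connection conn, long id, int version, String payload)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "UPDATE orders SET payload = ?, version = version + 1 "
                + "WHERE id = ? AND version = ?")) {
            ps.setString(1, payload);
            ps.setLong(2, id);
            ps.setInt(3, version);
            return ps.executeUpdate() == 1;  // 0 rows = stale version / duplicate
        }
    }
}
```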
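
And a minimal sketch of the two Redis-based solutions, using the Jedis client. The key prefixes and the 10-second lifetime follow the text; everything else is illustrative. Note that the separate check-then-set shown here has a race window; a production version would prefer an atomic SET NX EX.

```java
import redis.clients.jedis.Jedis;

class RedisIdempotency {

    // Token scheme: refuse the insert if this token was already used,
    // and mark it as used (with a lifetime) once the insert succeeds.
    static boolean submitWithToken(Jedis redis, String token, Runnable insert) {
        if (redis.exists("token:" + token)) {
            return false;                          // record exists: repeated addition
        }
        insert.run();                              // add the database record
        redis.setex("token:" + token, 10, "used"); // 10-second lifetime
        return true;
    }

    // Serial-number scheme: the upstream stored the serial in Redis; its
    // presence means "first call", and it is deleted after the data is added.
    static boolean submitWithSerial(Jedis redis, String serial, Runnable insert) {
        if (!redis.exists("serial:" + serial)) {
            return false;                          // already consumed: duplicate call
        }
        insert.run();
        redis.del("serial:" + serial);
        return true;
    }
}
```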

10. Distributed system design strategies

1. Heartbeat detection

Heartbeats usually carry status and metadata, which makes the nodes easier to manage

  • Periodic heartbeat detection – a node whose response times out is declared dead
  • Cumulative failure detection – make a limited number of retries against a node suspected of dying before declaring it dead
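
A minimal Java sketch of both detection styles; the timeout and retry threshold are illustrative.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class HeartbeatMonitor {
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();
    private final Map<String, Integer> misses = new ConcurrentHashMap<>();
    static final long TIMEOUT_MS = 3000;  // response timeout
    static final int MAX_MISSES = 3;      // retries before declaring death

    // A node reported in (the heartbeat may carry status and metadata).
    void onHeartbeat(String nodeId) {
        lastSeen.put(nodeId, System.currentTimeMillis());
        misses.put(nodeId, 0);
    }

    // Called periodically. Pure periodic detection would return true on the
    // first missed deadline; cumulative detection only suspects the node
    // and declares it dead after MAX_MISSES consecutive misses.
    boolean isDead(String nodeId) {
        long silence = System.currentTimeMillis() - lastSeen.getOrDefault(nodeId, 0L);
        if (silence > TIMEOUT_MS) {
            return misses.merge(nodeId, 1, Integer::sum) >= MAX_MISSES;
        }
        return false;
    }
}
```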

2. High availability

  • Active/standby mode (common) – when the active host fails, the standby host takes over all of its work; after the active host recovers, services are switched back automatically (hot standby) or manually (cold standby), e.g. MySQL, Redis
  • Mutual backup mode (multi-master) – two hosts run at the same time and monitor each other
  • Cluster mode – multiple nodes run at the same time and share requests through a primary node – the high availability of the primary node itself must then be addressed

3. Load balancing

Solutions

  • Hardware – F5
  • Software – LVS, HAProxy, Nginx

Strategies

  • Random
  • Round robin
  • Weighted
  • Least connections
  • IP hash
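
Two of the strategies fit in a few lines of Java; the server list is illustrative.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

class LoadBalancer {
    private final List<String> servers;
    private final AtomicLong counter = new AtomicLong();

    LoadBalancer(List<String> servers) {
        this.servers = servers;
    }

    // Round robin: spread requests evenly in arrival order.
    String roundRobin() {
        return servers.get((int) (counter.getAndIncrement() % servers.size()));
    }

    // IP hash: the same client IP always lands on the same server,
    // which preserves session affinity without shared state.
    String ipHash(String clientIp) {
        return servers.get(Math.floorMod(clientIp.hashCode(), servers.size()));
    }
}
```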

11. Network communication

1. RPC

Remote Procedure Call

Roles

  1. Client – the service caller
  2. Client Stub – packages the client's request into a network message and sends it to the Server Stub over the network
  3. Server Stub – receives the Client Stub's message, unpacks it, and invokes the local method
  4. Server – the service provider
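
The Client Stub role can be sketched with a JDK dynamic proxy. Here sendOverNetwork is a placeholder for the real transport, and the message format is purely illustrative.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class ClientStubFactory {

    // Returns a proxy implementing the service interface: every call is
    // packaged into a message and shipped to the Server Stub.
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> serviceInterface, String serverAddress) {
        InvocationHandler handler = (proxy, method, args) -> {
            // 1. Marshal the request into a network message.
            String request = method.getName() + ":" + java.util.Arrays.toString(args);
            // 2. Send it to the Server Stub, which unpacks the message and
            //    invokes the local method; the transport is left abstract here.
            return sendOverNetwork(serverAddress, request);
        };
        return (T) Proxy.newProxyInstance(
                serviceInterface.getClassLoader(),
                new Class<?>[] { serviceInterface }, handler);
    }

    static Object sendOverNetwork(String address, String request) {
        throw new UnsupportedOperationException("transport not shown");
    }
}
```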

2. RMI

Remote Method Invocation

Java's native support for communication between different Java virtual machines

Roles

Client side

  1. Stub – client-side proxy
  2. Remote Reference Layer – parses and executes the remote reference protocol
  3. Transport layer – sends the remote method call and receives the result

Server side

  1. Transport layer – receives the client's request and forwards it to the Remote Reference Layer
  2. Remote Reference Layer – calls the Skeleton's method
  3. Skeleton – calls the actual method

Registry – registers remote objects under a URL and returns references to those remote objects to clients
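
A minimal end-to-end example using Java's built-in RMI API; the Greeting interface and the registry name are illustrative. Server and client sides are shown in one main method for brevity.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

interface Greeting extends Remote {
    String hello(String name) throws RemoteException;
}

class RmiDemo {
    static class GreetingImpl implements Greeting {
        public String hello(String name) { return "hello, " + name; }
    }

    public static void main(String[] args) throws Exception {
        // Server side: export the object (the Skeleton side) and register it.
        Greeting stub = (Greeting) UnicastRemoteObject.exportObject(new GreetingImpl(), 0);
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("greeting", stub);

        // Client side: look up the remote reference and call it through the Stub.
        Greeting remote = (Greeting) LocateRegistry.getRegistry("localhost", 1099)
                .lookup("greeting");
        System.out.println(remote.hello("RMI"));
    }
}
```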

12. IO

1. Explanation of terms

zhuanlan.zhihu.com/p/22707398

  1. Blocking – after the call is made, the caller is suspended until the kernel has read the data and returned it to the application layer

  2. Non-blocking – the call returns immediately

  3. Synchronous – the application actively asks the kernel whether the data has been read

  4. Asynchronous – the application does not ask the kernel; the kernel actively notifies it once the read completes


2. Types

  1. BIO (synchronous blocking IO) – one thread per connection

    • simple
    • high resource overhead

  2. NIO (synchronous non-blocking IO) – one channel per connection – connections are registered with a multiplexer (selector), which polls the channels and starts thread processing only when a channel has data (see the sketch below)

  3. AIO (asynchronous non-blocking IO) – IO operations are delegated to the OS, which notifies the server application and invokes the completion callback when the operation finishes
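
A minimal sketch of the NIO multiplexer described above, using java.nio; the port and buffer size are illustrative, and a real server would hand ready channels to worker threads instead of reading inline.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

class NioServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));
        server.configureBlocking(false);               // non-blocking mode
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select();                         // poll the registered channels
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isAcceptable()) {              // new connection: register it
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {         // a channel has data: read it
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    if (client.read(buf) == -1) client.close();
                }
            }
            selector.selectedKeys().clear();
        }
    }
}
```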
