Distributed systems: how much do you know?
- One, foreword
- Two, distributed system
  - 1. Overview
  - 2. Segmentation method
    - (1) Horizontal segmentation method
    - (2) Vertical segmentation method
    - (3) Mixed segmentation method
  - 3. Problems faced
  - 4. Measurement criteria for distributed systems
- Three, distributed system design principles
  - 1. CAP principles
  - 2. BASE Theory
One, foreword
In 2011, the Internet Society of China announced that China had the largest number of Internet users in the world, and with the growth of the Internet in recent years that number has passed 900 million (see figure below). With it comes the era of big data, and to cope with hundreds of millions of visitors, can today’s applications do without high concurrency? One survey shows that when a page takes more than 3 seconds to load, more than 50% of users choose to leave, so fast response is naturally indispensable too.
Under the demands of big data, high concurrency, and fast response, a stand-alone system can no longer meet the business needs of today’s Internet. So how do we solve this? As hardware has become cheaper, application systems have evolved from single-machine systems into systems in which multiple machines collaborate, and a system in which multiple machines collaborate to fulfil business requirements is called a distributed system. Developing a distributed system is more complex and difficult, but as distributed systems have matured and become popular, and by building on earlier development experience and ideas, the concept of microservice architecture was put forward on the shoulders of giants (this article covers only distributed systems; microservices will be discussed in detail next time). As for what a distributed system is, read on.
Two, distributed system
1. Overview
To help you better understand what a distributed system is, here is a simple distributed architecture (as shown below). First, a user sends a request to the website over the Internet from a PC or mobile device. When the request reaches the server side, the routing algorithm in the gateway forwards it to one of the web servers, which carries out the corresponding business logic. This is how a simple distributed system works. The design has many shortcomings, but they do not affect the idea it conveys.
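The routing step described above can be sketched in a few lines. This is a minimal illustration that assumes round-robin as the routing algorithm (the article does not say which algorithm the gateway uses); the `Gateway` class and the server names are hypothetical.

```python
from itertools import cycle


class Gateway:
    """Toy gateway that forwards each request to the next web server in turn."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = cycle(self._servers)  # endless round-robin iterator

    def route(self, request):
        # Pick the next server and pretend it handled the request.
        server = next(self._next)
        return f"{server} handled {request}"


gateway = Gateway(["web-1", "web-2", "web-3"])
for i in range(4):
    print(gateway.route(f"req-{i}"))
```

With three servers, the fourth request wraps around to the first server again, so the load is spread evenly.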
So what is a distributed system? A distributed system is a system in which multiple machines collaborate to fulfil business requirements. It applies the idea of divide and conquer: many cheap, ordinary machines compute independently and then work together to complete the task, which also lets a distributed system meet today’s Internet requirements for big data, high concurrency, and fast response. With all that said, what are its advantages?
- **High performance**: A large number of requests are distributed sensibly across the nodes, which reduces the pressure on each web server. Because multiple machines process requests in parallel, the system can handle more requests and more data, and its performance is naturally higher than that of a stand-alone system. This addresses the three key demands on today’s Internet systems: big data, high concurrency, and fast response.
- **High availability**: The system automatically avoids faulty nodes and keeps providing service. On a stand-alone system, a machine failure may make the site unusable. In a distributed system, if a node fails, the system automatically removes it and stops forwarding requests to it, so the system as a whole remains usable.
- **Scalability**: When the existing machines can no longer keep up with business growth, more machines are needed. By modifying the routing algorithm so that it also routes to the new machines, more machines can be brought into the system, which then continues to meet the requirements of big data, high concurrency, and fast response. Conversely, when there are far more machines than needed, the number can be reduced to save costs.
- **Maintainability**: If a node cannot provide service for some reason, for example a hardware fault, you only need to stop the faulty node, repair it, and bring it back online.
- **Flexibility**: When the system needs to be updated, you can stop some of the nodes during off-peak hours, update them to the latest version, route requests to the updated nodes, and finally update the nodes still running the old version. This lets the website keep serving users without interruption during the update.
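Several of these advantages come down to the router knowing which nodes exist and which are healthy. The sketch below is one possible way to model that, not a description of any particular product: the `Cluster` class, its method names, and the node names are all hypothetical, and real systems detect failures with health checks rather than manual calls.

```python
class Cluster:
    """Hypothetical node pool: tracks which nodes exist and which are healthy."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._down = set()
        self._cursor = 0

    def add_node(self, node):
        # Scalability: new machines join the routing rotation.
        self._nodes.append(node)

    def remove_node(self, node):
        # Scaling in: drop a machine that is no longer needed.
        self._nodes.remove(node)
        self._down.discard(node)

    def mark_down(self, node):
        # High availability / maintenance: stop routing to this node.
        self._down.add(node)

    def mark_up(self, node):
        # The node was repaired or updated and may serve requests again.
        self._down.discard(node)

    def pick(self):
        """Round-robin over all nodes, skipping the ones marked down."""
        for _ in range(len(self._nodes)):
            node = self._nodes[self._cursor % len(self._nodes)]
            self._cursor += 1
            if node not in self._down:
                return node
        raise RuntimeError("no healthy node available")


cluster = Cluster(["node-a", "node-b"])
cluster.mark_down("node-a")   # node-a fails or is taken offline for an update
print(cluster.pick())         # requests keep flowing to node-b
cluster.mark_up("node-a")     # node-a is back online
cluster.add_node("node-c")    # scale out with one more machine
```

A rolling update follows the same pattern: mark a few nodes down, update them, mark them up, then repeat for the remaining nodes.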
2. Segmentation method
In a distributed system, the system has to be split up and deployed across multiple machines. Here are three commonly used segmentation methods.
(1) Horizontal segmentation method
The horizontal segmentation method, as its name implies, deploys the same system on multiple machines, so that every machine runs the same application and can complete its computation independently, without interfering with the others. See the diagram below:
Advantages:
- **Simple**: You only need to implement a routing algorithm that allocates requests to the nodes. Nginx, Netflix Zuul, and Spring Cloud Gateway are existing tools that provide this.
- **Independent**: Each node has complete computing functions and does not depend on other nodes, so there is little interaction between systems.
- **High availability**: When a node stops working, the system keeps operating without downtime, because the routing algorithm no longer allocates requests to that node.
- **Scalable**: Service nodes can be added as the business grows and removed as it shrinks. Both are easy.
- **High performance**: Each request is completed on a single machine with no external calls, so performance can be very high.
Disadvantages:
- **Reduces maintainability**: When the product is upgraded, every node has to be upgraded, which makes the system inconvenient to maintain.
- **Reduces extensibility**: Horizontal segmentation concentrates all business development in a single system. The resulting high coupling hampers future maintenance and expansion.
- **Makes the system unreliable and unstable**: Because the whole system is packaged and upgraded as one unit, an upgrade can easily make the system unstable and unreliable.
(2) Vertical segmentation method
The vertical segmentation method splits the system by business, so that each business is developed and maintained independently. It addresses the problem of the business becoming extremely complicated as it grows in breadth and depth and as the number of users explodes. See the diagram below:
Advantages:
- **Enhances business independence**: Dividing the system by business into highly cohesive, loosely coupled modules can greatly reduce development difficulty.
- **Improves flexibility**: When a service changes, only the related systems need to be maintained, instead of packaging and redeploying all systems.
- **Improves maintainability**: Problems are easier to detect in separate systems, because exceptions can be located more easily once the businesses are separated. This makes maintenance easier for both developers and business staff.
Disadvantages:
- **Increases interaction between systems**: Systems need to cooperate to complete tasks. For example, the user system and the product system must cooperate to complete a purchase (transaction).
- **Reduces availability**: Because systems depend on each other, the failure of one system affects the others. For example, if the product system has a problem, users cannot complete a purchase.
- **Data consistency is not guaranteed**: Because systems must communicate over the network, and network communication is often unreliable, data consistency between nodes is difficult to ensure.
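The availability cost of vertical segmentation is easy to see in a small model. The sketch below is illustrative only: the three service classes, their methods, and the sample data are hypothetical, and in-process calls stand in for what would be network calls between separately deployed systems.

```python
class ServiceUnavailable(Exception):
    """Raised when a downstream service cannot be reached."""


class UserService:
    def __init__(self):
        self._users = {"alice"}

    def exists(self, user):
        return user in self._users


class ProductService:
    def __init__(self):
        self.online = True
        self._stock = {"book": 2}

    def reserve(self, item):
        if not self.online:
            raise ServiceUnavailable("product service is down")
        if self._stock.get(item, 0) <= 0:
            return False
        self._stock[item] -= 1
        return True


class OrderService:
    """Depends on the user and product services to complete a purchase."""

    def __init__(self, users, products):
        self._users = users
        self._products = products

    def purchase(self, user, item):
        if not self._users.exists(user):
            return "unknown user"
        try:
            reserved = self._products.reserve(item)
        except ServiceUnavailable:
            # The failure of one system surfaces in another.
            return "purchase failed: product service unavailable"
        return "order placed" if reserved else "out of stock"


products = ProductService()
orders = OrderService(UserService(), products)
print(orders.purchase("alice", "book"))
products.online = False  # simulate an outage of the product system
print(orders.purchase("alice", "book"))
```

The order system itself is healthy in the second call, yet the purchase still fails because it depends on the product system.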
(3) Mixed segmentation method
The mixed segmentation method combines horizontal and vertical segmentation and is the approach adopted by most microservice architectures today. For example, the system is first divided into separate server clusters by business dimension, and each service is then segmented horizontally so that it runs on multiple nodes. See the diagram below:
The mixed segmentation method combines the advantages of both vertical and horizontal segmentation. Problems of heavy interaction and data consistency are easier to handle than tight coupling and a lack of flexibility, so mixed segmentation has gradually become the mainstream approach. It still does not eliminate heavy interaction between systems or the difficulty of maintaining data consistency, however, and the large number of nodes raises the hardware cost of building the distributed system.
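A rough way to picture mixed segmentation is a lookup in two steps: first by business service (the vertical split), then across that service's identical replicas (the horizontal split). The sketch below assumes round-robin within each pool; `ServiceRegistry` and the replica names are hypothetical.

```python
from itertools import cycle


class ServiceRegistry:
    """Maps each business service to its own pool of identical replicas."""

    def __init__(self, replicas_by_service):
        self._pools = {}
        for name, replicas in replicas_by_service.items():
            replicas = list(replicas)
            if not replicas:
                raise ValueError(f"service {name!r} has no replicas")
            # One independent round-robin iterator per service.
            self._pools[name] = cycle(replicas)

    def route(self, service):
        if service not in self._pools:
            raise KeyError(f"unknown service: {service}")
        return next(self._pools[service])


registry = ServiceRegistry({
    "user": ["user-1", "user-2"],
    "product": ["product-1", "product-2", "product-3"],
})
print(registry.route("user"))     # vertical: pick the user cluster
print(registry.route("product"))  # horizontal: pick a replica inside it
```

Each service can be scaled by changing only its own replica list, which is where the extra hardware cost mentioned above comes from.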
3. Problems faced
Once there is more than one node, a distributed system can cooperate only through network communication, which introduces many uncertainties. Network unreliability (packet loss, delay, and so on), transmission rate, and other factors therefore constrain distributed systems in many ways, which can be summarized as follows:
- **Heterogeneous machines and networks**: In a distributed system, machines differ in configuration, architecture, performance, operating system, and so on. Different networks also differ in bandwidth, delay, and packet loss rate. Making all the machines in a multi-machine system work together toward the same business goal is therefore a very complicated problem.
- **Common node failure**: In a distributed system, there are always some machines that cannot keep working for some reason (such as a power outage or disk damage). How the system detects them, removes them automatically, and assigns requests to the working nodes so that it can keep providing service is another problem to face.
- **Unreliable networks and machines**: Machines interact over the network, and network transmission is bound to suffer partitions, delays, out-of-order delivery, packet loss, and similar problems. A machine's processing capacity can also drop as requests increase, which can affect users' loyalty to the site.
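To sum up: a distributed system has to cope with heterogeneous machines and networks, with frequent node failures, and with unreliable networks and machines.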
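One common way for a caller to cope with packet loss and delay is to retry a failed call after a short, growing pause. The following is a minimal sketch of retry with exponential backoff, offered as an illustration rather than as the article's own technique; `call_with_retry`, `make_flaky`, and the default values are hypothetical, and retrying is only safe when the operation can be repeated without side effects.

```python
import random
import time


def call_with_retry(operation, attempts=3, base_delay=0.01):
    """Call operation(); on ConnectionError or TimeoutError, retry with backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff plus a little jitter to avoid synchronized retries.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))


def make_flaky(failures):
    """Hypothetical remote call that loses the first `failures` requests."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ConnectionError("simulated packet loss")
        return "ok"

    return operation


print(call_with_retry(make_flaky(2)))  # succeeds on the third attempt
```

If the call keeps failing after the last attempt, the exception is re-raised so that the caller can treat the node as down.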
4. Measurement criteria for distributed systems
Given that distributed systems can solve so many problems, what makes a distributed system design a good one? Distributed Systems: Principles and Paradigms presents some generally agreed criteria:
- **Transparency**: To the outside, a distributed system behaves like a stand-alone system. Users do not need to know its internal implementation, only its parameters, functions, and return results.
- **Scalability**: If the existing nodes in a distributed system cannot meet the needs of business growth, new nodes can be added to cope with the increase in data. When the business shrinks, nodes can be removed to save resources.
- **Availability**: In general, a distributed system should provide service around the clock and, even when failures occur, keep serving users as far as possible. Its availability can therefore be measured by the ratio of normal service time to unavailable time.
- **Reliability**: Data must be computed and stored correctly, without loss.
- **High performance**: Requests are processed faster because multiple nodes share them. Combined with the fact that each node can itself handle requests efficiently, the performance of a distributed system is much higher than that of a stand-alone system.
- **Consistency**: Because a distributed system uses multiple nodes, several machines must cooperate to process the data for a single business operation. Network problems (packet loss, delay, instability, out-of-order coordination, and so on) can make data inconsistent or cause it to be lost. For data such as monetary amounts, errors and loss are unacceptable, so the ability to keep data consistent and prevent loss is an important criterion for a distributed system.
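The availability criterion can be turned into a number. The sketch below assumes the common convention of expressing availability as normal service time divided by total observed time, which is a restatement chosen here rather than the exact ratio given in the list; both function names are hypothetical.

```python
def availability(uptime_hours, downtime_hours):
    """Fraction of total observed time during which the system served normally."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


def allowed_downtime_hours(target, period_hours=365 * 24):
    """Downtime budget for a period, given a target availability such as 0.999."""
    return period_hours * (1 - target)


# A year has 8760 hours; a 99.9% target leaves about 8.76 hours of downtime.
print(round(allowed_downtime_hours(0.999), 2))
print(round(availability(8751.24, 8.76), 4))
```

Expressed this way, each additional "nine" in the target cuts the yearly downtime budget by a factor of ten.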
Three, distributed system design principles
Because distributed systems are complex, scholars and experts have proposed many approaches, of which the CAP principle and BASE theory are the best known and most influential.
1. CAP principles
Before we get to the CAP principle, let’s look at three main properties of distributed systems: consistency, availability, and partition tolerance. The definitions are as follows:
- **Consistency**: All nodes hold the same, logically consistent data at the same time.
- **Availability**: Every request receives a response, whether it reports success or failure.
- **Partition tolerance**: The system keeps operating despite message loss or partial failure within it.
Based on these three properties, Professor Eric Brewer of the University of California, Berkeley, proposed the CAP conjecture at the ACM PODC conference in July 2000. Two years later, Seth Gilbert and Nancy Lynch of the Massachusetts Institute of Technology proved it formally, and CAP has since been an accepted theorem in distributed computing. It states that no distributed system can satisfy all three properties at once. The relationship can be drawn as the set diagram in the figure below.
**Notes:**
- **CA**: A system that satisfies consistency and availability can hardly achieve much in scalability.
- **CP**: A system that satisfies consistency and partition tolerance usually does not have particularly high performance.
- **AP**: A system that satisfies availability and partition tolerance generally has lower consistency requirements but higher performance.
- **The blank part**: Any system can do only two of these three well, never all three.
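The trade-off becomes concrete once the network is actually partitioned. The toy model below is a deliberately simplified sketch, not a real replication protocol: `ReplicatedStore`, its two in-memory replicas, and the `partitioned` flag are hypothetical. In CP mode a write is refused during a partition (giving up availability), while in AP mode it is accepted locally and the replicas diverge (giving up consistency).

```python
class Unavailable(Exception):
    """Raised when a CP store refuses a request to protect consistency."""


class ReplicatedStore:
    """Two-replica toy key-value store with a switchable network partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]
        self.partitioned = False

    def write(self, replica, key, value):
        if not self.partitioned:
            for data in self.replicas:  # healthy network: update every replica
                data[key] = value
            return
        if self.mode == "CP":
            raise Unavailable("cannot reach the other replica")
        self.replicas[replica][key] = value  # AP: accept locally, replicas diverge

    def read(self, replica, key):
        return self.replicas[replica].get(key)


ap = ReplicatedStore("AP")
ap.write(0, "x", 1)
ap.partitioned = True
ap.write(0, "x", 2)                      # still accepted
print(ap.read(0, "x"), ap.read(1, "x"))  # the two replicas now disagree
```

Running the same steps with `ReplicatedStore("CP")` raises `Unavailable` on the second write, and both replicas keep returning the old value.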
2. BASE Theory
Before we get to BASE theory, let’s look at two concepts: “strong consistency” and “weak consistency” (the consistency in the CAP principle is strong consistency).
- **Strong consistency**: After a user completes a data update, any subsequent thread or other node reads the latest value. According to the CAP principle, pursuing strong consistency requires a significant sacrifice in performance.
- **Weak consistency**: After a user completes a data update, there is no guarantee that subsequent threads or other nodes will immediately read the latest value. Eventual consistency can only be ensured by some other means.
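The difference between the two models can be shown with a primary copy and a single replica. This is a minimal sketch under simplifying assumptions: the `Primary` class is hypothetical, synchronous replication stands in for strong consistency, and a queue that is drained later stands in for the "other means" by which a weakly consistent system catches up.

```python
from collections import deque


class Primary:
    """Primary copy with one replica; replication is synchronous or queued."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.data = {}
        self.replica = {}
        self._pending = deque()  # updates not yet applied to the replica

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            self.replica[key] = value  # strong: replica updated before returning
        else:
            self._pending.append((key, value))  # weak: replica catches up later

    def replicate(self):
        """Apply queued updates in order; afterwards the two copies agree."""
        while self._pending:
            key, value = self._pending.popleft()
            self.replica[key] = value

    def read_from_replica(self, key):
        return self.replica.get(key)


weak = Primary(synchronous=False)
weak.write("balance", 100)
print(weak.read_from_replica("balance"))  # stale: the update has not arrived yet
weak.replicate()
print(weak.read_from_replica("balance"))  # the replica has caught up
```

With `synchronous=True` the first read already returns the new value, at the cost of every write waiting for the replica.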
BASE theory was formally put forward by Dan Pritchett, an architect at eBay, in an article published by the ACM. It is a practical summary drawn from most distributed systems. Its core idea is that, even when strong consistency cannot be achieved, a system can use an approach suited to its own business to reach eventual consistency. The core content of BASE theory is basic availability, soft state, and eventual consistency. Because integrating and scaling a distributed system is already quite complex, guaranteeing strong consistency would require many additional complex protocols, which adds technical complexity and also hurts performance. BASE theory instead suggests allowing data to be inconsistent for a period of time, which reduces implementation complexity and improves system performance, and then reaching data consistency by other means.
BASE isn’t actually a word; it is made up of the abbreviations BA, S, and E, so don’t get it wrong. What do they stand for?
- **BA (Basically Available)**: Put simply, the most important requirement of a distributed system is to stay basically available and to return a response. Take the Double Eleven (“hand-chopping”) shopping festival: if a flash purchase fails, the system shows “system is busy, please come back later”. If it gave no response at all, you would be left waiting indefinitely, and you can imagine how painful that is. Giving the user a clear message spares them a long wait and is itself good service.
- **S (Soft State)**: The system is allowed to be in intermediate states. Data exchanged between systems is generally replicated, and the replicas lag by some delay. In this case, it is recommended to use weak consistency instead of strong consistency to improve system availability and performance. For a website’s user experience, a quick result is often more important than consistency, because no one wants to use a website that takes more than ten seconds to respond.
- **E (Eventual Consistency)**: All data copies in the system reach a consistent state after a certain period of time, which ensures that the data is correct.
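The “system is busy” behaviour described under Basically Available amounts to answering immediately when capacity is exhausted instead of letting requests queue up. The sketch below is one simple way to model that; `FlashSaleService`, its capacity counter, and the message strings are hypothetical, and the class is not thread-safe as written.

```python
class FlashSaleService:
    """Toy purchase endpoint that sheds load instead of making users wait."""

    BUSY = "system is busy, please come back later"

    def __init__(self, capacity):
        self.capacity = capacity  # maximum purchases processed at the same time
        self.in_flight = 0

    def begin_purchase(self, user):
        """Admit the request if there is capacity; otherwise answer at once."""
        if self.in_flight >= self.capacity:
            return self.BUSY
        self.in_flight += 1
        return f"processing order for {user}"

    def complete_purchase(self):
        """Release the slot held by a finished request."""
        if self.in_flight > 0:
            self.in_flight -= 1


service = FlashSaleService(capacity=1)
print(service.begin_purchase("alice"))  # admitted
print(service.begin_purchase("bob"))    # rejected quickly with a clear message
service.complete_purchase()
print(service.begin_purchase("bob"))    # admitted once capacity frees up
```

The rejected user gets a degraded but immediate answer, which is the trade that basic availability describes.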