In technology, "distributed systems" is becoming an inescapable term. The reason is that the data scale of this era exceeds the storage and processing capacity of a single machine. There are two ways out: making the machine bigger (scaling up) or connecting many machines (scaling out). The former is expensive and inflexible, so the latter is increasingly popular. By a kind of law of cost conservation, cost does not vanish into thin air: as hardware cost goes down, software design cost goes up. Distributed systems theory is the key to keeping that software cost under control.
What is a distributed system
Leslie Lamport [1], a pioneer of distributed systems, wrote in one of his most important papers, “Time, Clocks, and the Ordering of Events in a Distributed System” [2]:
A system is distributed if the message transmission delay is not negligible compared to the time between events in a single process.
Lamport explains the problem in terms reminiscent of relativity. Consider two time scales: the delay of messages between processes and the interval between events inside a process. If the former is not negligible relative to the latter, the group of processes is a distributed system.
To understand this definition, you need several concepts (formal definitions always build on one another): processes, messages, and events. To avoid an endless nesting of definitions, I will not expand on them here and only offer an intuition: a process is a laborer responsible for some work; the work can be broken down into steps, and each step is an event; messages are how the laborers communicate.
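Lamport's criterion can be read as a simple ratio test. The sketch below is only an illustration: the function name `is_distributed` and the cutoff `negligible_ratio` are assumptions made here, since the paper does not fix a numeric threshold for "negligible".

```python
def is_distributed(message_delay_s, event_interval_s, negligible_ratio=0.01):
    """Return True if message delay is NOT negligible next to the event interval.

    negligible_ratio is an assumed cutoff chosen for illustration only.
    """
    if event_interval_s <= 0:
        raise ValueError("event_interval_s must be positive")
    return message_delay_s / event_interval_s >= negligible_ratio


# Two machines in one data center: 0.5 ms per message, 1 ms between events.
print(is_distributed(0.0005, 0.001))  # True
# Threads sharing memory: roughly 100 ns per message, 1 ms between events.
print(is_distributed(1e-7, 1e-3))     # False
```

The numbers in the usage lines are made-up orders of magnitude, not measurements.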
Wikipedia's entry on distributed computing [3] lists two defining properties:
- There are several autonomous computational entities (computers or nodes), each of which has its own local memory.
- The entities communicate with each other by message passing.
These properties involve the most important resources in a computer system: computation, memory, and the network that connects them.
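The two properties can be mimicked on one machine with threads that never touch each other's state. This is a minimal sketch, assuming a hypothetical `Node` class where a `queue.Queue` stands in for the network; a real system would use sockets or an RPC layer instead.

```python
import queue
import threading


class Node(threading.Thread):
    """A hypothetical computational entity: private memory plus an inbox."""

    def __init__(self, name):
        super().__init__(name=name)
        self.inbox = queue.Queue()  # stand-in for the network link
        self.local_memory = {}      # only this node writes to it

    def run(self):
        while True:
            message = self.inbox.get()
            if message is None:     # sentinel: shut the node down
                return
            key, value = message
            self.local_memory[key] = value


def run_demo():
    a, b = Node("a"), Node("b")
    a.start()
    b.start()
    a.inbox.put(("x", 1))           # the only way to affect a node is a message
    b.inbox.put(("y", 2))
    for node in (a, b):
        node.inbox.put(None)
        node.join()
    return a.local_memory, b.local_memory


print(run_demo())  # ({'x': 1}, {'y': 2})
```

Each node changes its own memory only in response to a message, which is the message-passing discipline the list describes.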
To sum up, we can describe distributed systems from another perspective:
Externally, distributed systems act as a whole, providing specific functions based on the overall storage and computing capabilities.
Internally, a distributed system is a group of individuals that communicate through network messages and work cooperatively.
The design goal of distributed systems is to maximize overall resource utilization while handling local errors and maintaining external availability.
Author: A miscellany of wood birds, www.qtmuniao.com/2021/10/10/…
What are the characteristics
When building distributed systems, there are some logical aspects to consider:
- Scalability: Scalability is an essential requirement of distributed systems: the design should let us cope with growing external demand simply by adding machines.
- Fault tolerance / availability: This is a consequence of scale. As systems get bigger, single-machine failures become the norm, and the system must handle these faults automatically to stay available to the outside.
- Concurrency: With no global clock to coordinate them, scattered machines naturally live in “parallel universes”. The system needs to steer this concurrency into collaboration, so that cluster-wide tasks can be split up and executed.
- Heterogeneity (internal): The system needs to cope with differences in hardware, operating systems, and middleware within the cluster, and be able to accommodate new heterogeneous components.
- Transparency (external): The system hides its complexity from the outside and presents a logically unified whole.
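For the concurrency point, the Lamport paper cited earlier [2] replaces the missing global clock with logical clocks. The sketch below is a common textbook rendering of that idea, not code from the paper; the class and method names are chosen here for illustration.

```python
class LamportClock:
    """Logical clock: a counter that orders events without a global clock."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event and return the new time."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; its timestamp travels with the message."""
        return self.tick()

    def receive(self, message_time):
        """Move past both the local time and the message timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                  # a's clock: 1
stamp = a.send()          # a's clock: 2, message carries 2
print(b.receive(stamp))   # 3
```

The receive is stamped later than the send even though `b` had seen no events of its own, so the causal order between the two machines is preserved.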
What types are there
When organizing distributed systems, there are several physical architectures:
- Master-workers: One machine does the directing and the other machines do the work, as in Hadoop. The upside is that it is relatively easy to design and implement; the downside is that the master is a single bottleneck and a single point of failure.
- Peer-to-peer architecture: All machines are logically equivalent, as in Amazon's Dynamo. The upside is that there is no single point of failure; the downside is that the machines are hard to coordinate and consistency is hard to guarantee. If the system is stateless, however, this architecture is a good fit.
- Multi-tier architecture: This composite architecture is the most common in practice, for example separating storage from compute. Each tier can be designed for different characteristics (I/O-intensive, compute-intensive) and can even reuse existing components (cloud native).
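The master-workers pattern can be sketched in a few lines, assuming a thread pool stands in for the worker machines. The function names are hypothetical, and this is not how Hadoop itself is implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def worker_count_words(chunk):
    """Worker: count the words in one chunk of the input."""
    counts = {}
    for word in chunk.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def master_word_count(chunks, num_workers=3):
    """Master: hand chunks to the workers, then merge their partial results."""
    total = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for partial in pool.map(worker_count_words, chunks):
            for word, n in partial.items():
                total[word] = total.get(word, 0) + n
    return total


print(master_word_count(["a b a", "b c"]))  # {'a': 2, 'b': 2, 'c': 1}
```

All dispatching and merging runs through `master_word_count`, which is why the master is both the easy part to reason about and the single bottleneck.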
What are the pros and cons
To repeat, distributed systems are a reluctant response to the mismatch between single-machine capacity and data scale. In system design, a single-machine solution should therefore be considered first; after all, going distributed makes a system far more complex.
Now let’s summarize the advantages and disadvantages of distributed systems.
Advantages
High availability, high throughput, high scalability.
- Near-unlimited scaling: With good design, adding machines linearly keeps up with growing demand.
- Low latency: With multi-region deployment, user requests are routed to the nearest data center for processing.
- High availability and fault tolerance: When some machines break down, the system can still serve requests normally.
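A back-of-the-envelope calculation shows why redundancy raises availability. The sketch assumes that replicas fail independently and that one healthy replica is enough to serve; both are simplifying assumptions that real deployments often violate (shared racks, correlated outages). The function name is hypothetical.

```python
def replicated_availability(single_availability, replicas):
    """Probability that at least one replica is up, assuming independent failures."""
    if not 0.0 <= single_availability <= 1.0:
        raise ValueError("single_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The system is down only when every replica is down at the same time.
    return 1 - (1 - single_availability) ** replicas


# Three replicas that are each up 99% of the time.
print(round(replicated_availability(0.99, 3), 6))  # 0.999999
```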
Disadvantages
The biggest problem is complexity.
- Data consistency. Machines fail in many ways (crashes, restarts, shutdowns), and data may be lost, stale, or corrupted. Making the system tolerate these problems while still presenting correct data to the outside requires a fairly complex design.
- Unreliable network and communication. Messages may be lost, arrive out of order, arrive late, or hang indefinitely, which greatly complicates coordination between machines. Basic network protocols such as TCP solve some of these problems, but many more must be handled at the system level. On top of that, messages can be forged on the open Internet.
- Management complexity. Once the number of machines reaches a certain order of magnitude, effectively monitoring them, collecting their logs, and balancing load across them becomes a challenge.
- Latency. Network communication is several orders of magnitude slower than communication within a machine, and the more components and network hops involved, the higher the latency, which ultimately affects the quality of service the system provides.
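One common way to handle message loss at the system level is to retry until an acknowledgement arrives and have the receiver discard duplicates by message ID. The sketch below is a toy illustration: `ScriptedChannel`, `Receiver`, and `send_with_retry` are hypothetical names, and the channel's drops are scripted rather than random so the run is repeatable.

```python
class ScriptedChannel:
    """Hypothetical network whose drops follow a fixed script."""

    def __init__(self, outcomes):
        # outcomes: iterable of booleans, True means the transmission gets through.
        self._outcomes = iter(outcomes)

    def transmit(self):
        # Once the script runs out, every further transmission counts as lost.
        return next(self._outcomes, False)


class Receiver:
    def __init__(self):
        self.seen_ids = set()
        self.applied = []

    def handle(self, msg_id, payload):
        # Apply each message ID at most once, even if it is delivered again.
        if msg_id not in self.seen_ids:
            self.seen_ids.add(msg_id)
            self.applied.append(payload)


def send_with_retry(channel, receiver, msg_id, payload, max_attempts=5):
    """Resend until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if not channel.transmit():
            continue                 # request lost: sender times out and retries
        receiver.handle(msg_id, payload)
        if channel.transmit():
            return attempt           # acknowledgement made it back
        # Acknowledgement lost: the sender cannot tell, so it retries.
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


channel = ScriptedChannel([False, True, False, True, True])
receiver = Receiver()
print(send_with_retry(channel, receiver, "m1", "debit 10"))  # 3
print(receiver.applied)                                      # ['debit 10']
```

The second attempt reaches the receiver but its acknowledgement is lost, so the third attempt delivers a duplicate; deduplication is what keeps the payload from being applied twice.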
References
- [1] Wikipedia, Leslie Lamport: en.wikipedia.org/wiki/Leslie…
- [2] Leslie Lamport, “Time, Clocks, and the Ordering of Events in a Distributed System”: lamport.azurewebsites.net/pubs/time-c…
- [3] Wikipedia, Distributed computing: en.wikipedia.org/wiki/Distri…
- [4] Confluent, distributed systems complete guide: www.confluent.io/learn/distr…
- [5] Splunk, What is a distributed system: www.splunk.com/en_us/data-…
I am Green teng mu bird, a distributed systems programmer who enjoys photography. You are welcome to follow my public account: “Mu Bird miscellaneous Record”.