Contents
(1) Terabytes of data on one machine: hard!
(2) What exactly is distributed storage?
(3) What is a distributed storage system?
(4) Good heavens! What if a machine goes down?
(5) How does the Master node sense that a data copy has disappeared?
(6) How to keep enough copies
(7) What about deleting redundant copies?
(8) Summary
In this article, we will talk about fault-tolerant architecture design for large-scale distributed systems in very simple language. Words like “distributed” and “fault-tolerant architecture” may sound a little complicated, but we will follow the old rules: plain English plus several hand-drawn color pictures, step by step, so that every reader can understand the design ideas behind this complex architecture.
1. Terabytes of data on one machine: Hard!
Let’s talk about fault-tolerant architecture design using a distributed storage system as an example.
First, what is a distributed storage system?
It’s simple. Let’s take a table in a database as an example.
Let’s say you have a database with one really big table in it, holding billions of rows.
Further, suppose that this table contains tens of terabytes of data, or even hundreds of terabytes. How would you feel?
Panicked and helpless, of course, because if you use a database like MySQL, a single database server may not have enough disk space to hold this table!
Let’s take a look at the picture below to get a feel for it.
2. What exactly is distributed storage?
So, if you have a huge data set of hundreds of terabytes, you should forget about traditional database technology.
Since the data probably won’t fit on a single database server, let’s think about distributed storage, right? Right! This is the way to solve the problem.
We can use more than one machine! If you have 20 machines, you put 1/20 of the data on each machine.
For example, with a total of 20 terabytes of data, each machine only needs to hold 1 terabyte. Every machine can easily and happily hold that much data.
So, taking a large data set, breaking it up into multiple pieces, and putting those pieces on multiple machines is called distributed storage.
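The split described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_into_shards` and the machine names are made up for this example, and a real system would split by rows or blocks rather than by whole terabytes.

```python
def split_into_shards(total_tb: int, machine_count: int) -> dict[str, int]:
    """Spread total_tb terabytes as evenly as possible over machine_count machines."""
    base, extra = divmod(total_tb, machine_count)
    # The first `extra` machines take one additional terabyte when the split is uneven.
    return {
        f"machine-{i + 1:02d}": base + (1 if i < extra else 0)
        for i in range(machine_count)
    }


placement = split_into_shards(total_tb=20, machine_count=20)
print(placement["machine-01"])  # 1
print(sum(placement.values()))  # 20
```

With 20 terabytes and 20 machines, every machine ends up with exactly 1 terabyte, which matches the example in the text.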
Let’s look at the picture below.
So what is a distributed storage system?
3. What is a distributed storage system?
A distributed storage system, of course, is a system that takes a large data set, divides it into chunks, stores the chunks on multiple machines, and then centrally manages the data stored on those machines.
For example, the classic Hadoop is one of these systems, and FastDFS is similar.
If you let your imagination go and look for what they have in common, Elasticsearch, Redis Cluster, and so on are all essentially similar systems.
They are all based on a distributed system architecture, which breaks up huge data into multiple pieces and stores them on multiple machines for you.
This article looks at things from the distributed system architecture level and is not tied to any particular technology, so let’s assume that this distributed storage system has two kinds of processes.
One is the Master node, a process located on a single machine that centrally manages the data distributed across multiple machines.
The others are called Slave nodes. Each machine has a Slave node that manages the data on that machine and communicates with the Master node.
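To make the two roles a bit more concrete, here is a minimal in-memory sketch. The class names `MasterNode` and `SlaveNode`, the `register` method, and the idea that a Slave reports its shards to the Master are assumptions made for illustration. They are not taken from any specific system.

```python
from dataclasses import dataclass, field


@dataclass
class SlaveNode:
    """Runs on one machine and manages the shards stored there."""
    machine: str
    shards: set[str] = field(default_factory=set)

    def store(self, shard_id: str) -> None:
        self.shards.add(shard_id)


@dataclass
class MasterNode:
    """Runs on a single machine and tracks which machines hold which shard."""
    shard_locations: dict[str, set[str]] = field(default_factory=dict)

    def register(self, slave: SlaveNode) -> None:
        # The slave reports the shards it holds; the master records where they are.
        for shard_id in slave.shards:
            self.shard_locations.setdefault(shard_id, set()).add(slave.machine)


master = MasterNode()
slave = SlaveNode(machine="machine-01")
slave.store("1TB data 01")
master.register(slave)
print(master.shard_locations)  # {'1TB data 01': {'machine-01'}}
```

The Master only keeps metadata (which shard lives where), while the actual data stays on the Slave machines.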
Let’s take a look at the picture below and see the above description visually.
4. Good Heavens! What if a machine goes down?
Here’s another problem. What if one of those 20 machines goes down?
This is embarrassing, because what used to be a complete 20 terabytes of data is now only 19 terabytes: 1 terabyte was lost when the machine crashed.
Of course you can’t allow that to happen, so you need a data replication strategy.
For example, we can make two redundant copies of the 1TB of data on each machine and put them on other machines. Then, if one machine goes down, it’s fine, because there are copies of its data on other machines.
Let’s take a look at this multi-copy redundant architecture design.
The light blue “1TB data 01” in the figure above represents the first 1TB data shard of the 20TB data set.
As you can see, it has three copies: the light blue squares on three different machines represent them.
In this case, each data shard has three copies. The other shards are handled the same way.
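One simple way to lay out three copies per shard is to place each shard on three consecutive machines, wrapping around at the end of the list. The sketch below assumes this round-robin policy purely for illustration. The article does not say how machines are chosen, and `place_replicas` is a made-up name.

```python
def place_replicas(shard_ids, machines, copies=3):
    """Assign each shard to `copies` distinct machines, round-robin style."""
    if copies > len(machines):
        raise ValueError("need at least as many machines as copies")
    placement = {}
    for index, shard_id in enumerate(shard_ids):
        # Consecutive machines (with wrap-around) are always distinct.
        placement[shard_id] = [
            machines[(index + k) % len(machines)] for k in range(copies)
        ]
    return placement


machines = [f"machine-{i:02d}" for i in range(1, 21)]
shards = [f"1TB data {i:02d}" for i in range(1, 21)]
placement = place_replicas(shards, machines)
print(placement["1TB data 01"])  # ['machine-01', 'machine-02', 'machine-03']
```

With 20 shards, 20 machines, and 3 copies each, every machine ends up holding 3 shards, and no shard has two copies on the same machine.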
Now suppose a machine goes down, such as the machine shown below. This inevitably means that one of the copies of the “1TB data 01” shard is lost. As shown below:
Does it matter? No, because the “1TB data 01” shard still has 2 other copies on the two surviving machines!
So if someone wants to read the data, they can just pick a copy on one of the other two machines and read it. The data won’t be lost. Don’t worry, big brother.
5. How does the Master node sense the disappearance of the data copy?
Now we have a problem. Let’s say a client wants to read the shard “1TB data 01”. It will go to the Master node and ask:
“Can you tell me where the data shard ‘1TB data 01’ is? Which machine is it on? I need to read it!”
Let’s take a look at the picture below.
The Master node needs to select one of the three copies of “1TB data 01” and reply:
“Brother, there is a copy on such-and-such machine. You can go to that machine and read ‘1TB data 01’ from it.”
However, the problem is that the Master node does not know that copy 3 of “1TB data 01” is gone, so if the Master node still tells clients to read copy 3, the read will fail.
So, how do we let the Master know that copy 3 is missing?
The Slave node on each machine, which manages the data there, sends a heartbeat to the Master node at a fixed interval (say every second).
Therefore, if the Master node has not received a heartbeat from a Slave node for a period of time (say 30 seconds), it assumes that the machine the Slave node runs on is down and that all the data copies on that machine are lost. From then on, the Master node will not tell anyone to read the lost copies.
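The heartbeat timeout described above can be sketched as follows. The class name `HeartbeatMonitor` and its methods are hypothetical, and the current time is passed in explicitly so the example is deterministic. A real Master would more likely read a monotonic clock.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per machine and flags machines that went silent."""

    def __init__(self, timeout_seconds: float = 30.0):
        self.timeout_seconds = timeout_seconds
        self.last_seen: dict[str, float] = {}

    def record_heartbeat(self, machine: str, now: float) -> None:
        self.last_seen[machine] = now

    def dead_machines(self, now: float) -> list[str]:
        # A machine is considered down once its silence exceeds the timeout.
        return sorted(
            machine
            for machine, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=30.0)
for second in range(0, 41):
    monitor.record_heartbeat("machine-01", now=float(second))
    monitor.record_heartbeat("machine-02", now=float(second))
    if second < 5:  # machine-03 crashes after its heartbeat at t=4
        monitor.record_heartbeat("machine-03", now=float(second))

print(monitor.dead_machines(now=40.0))  # ['machine-03']
```

Here machine 3 stops sending heartbeats after second 4, so by second 40 it has been silent for 36 seconds and is treated as down, while the other two machines are still considered alive.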
If you look at the diagram below, once the Slave node is down and the Master node stops receiving its heartbeat, the Master assumes that copy 3 on that machine has been lost and will no longer let anyone read copy 3 on that machine.
At this point, the Master node can tell a client to read either copy 1 or copy 2 of “1TB data 01”, since both copies are still there.
For example, the client can be told to read copy 1. The client then goes to the Slave node on that machine and says it wants to read copy 1.
The whole process is illustrated below.
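The lookup step can be sketched like this: the Master filters out copies on machines it believes are down and hands the client one of the remaining locations. The function name `locate_replica` and the random choice among live copies are assumptions for illustration. The article only says that the Master picks one of the surviving copies.

```python
import random


def locate_replica(shard_id, replica_locations, live_machines, rng=random):
    """Return one live machine that holds a copy of shard_id."""
    candidates = [m for m in replica_locations[shard_id] if m in live_machines]
    if not candidates:
        raise LookupError(f"no live replica of {shard_id!r}")
    return rng.choice(candidates)


replica_locations = {"1TB data 01": ["machine-01", "machine-02", "machine-03"]}
live_machines = {"machine-01", "machine-02"}  # machine-03 missed its heartbeats

chosen = locate_replica("1TB data 01", replica_locations, live_machines)
print(chosen in {"machine-01", "machine-02"})  # True
```

The client then contacts the Slave node on the returned machine to read the data itself.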
6. How to keep enough copies
There is another problem: the “1TB data 01” shard now has only two copies, copy 1 and copy 2, which is short of the required three.
We are supposed to have three copies of each shard. So, think about it: how do you add a copy back to this data shard?
Quite simply, once the Master node senses that a machine is down, it also knows which data shards no longer have enough copies.
At this point, it creates a replication task: pick another machine and have it copy the shard from a machine that still holds a copy.
For example, if you look at the figure below, you can pick machine 4 to make a copy from machine 2.
But now that the replication task exists, how do we let machine 4 know about it?
It’s actually quite simple. Doesn’t machine 4 send a heartbeat every second? When machine 4 sends a heartbeat, the Master sends the replication task back to machine 4 in the heartbeat response, telling machine 4 to make a copy from machine 2.
Again, here’s a picture of the process:
Look at the figure above. Is there now a copy 3 of “1TB data 01” on machine 4? Does the “1TB data 01” shard have three copies again?
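The replication flow of this section can be sketched as a small in-memory model. Everything here is hypothetical: the `Master` class, its method names, the task dictionary format, and the rule for choosing the source and target machines are assumptions made for illustration. The sketch also only handles the single-failure case from the text, where each shard is missing at most one copy.

```python
from collections import defaultdict


class Master:
    def __init__(self, replica_locations, target_copies=3):
        self.replica_locations = {s: set(ms) for s, ms in replica_locations.items()}
        self.target_copies = target_copies
        # machine -> tasks waiting to be returned in its next heartbeat response
        self.pending_tasks = defaultdict(list)

    def handle_machine_down(self, dead, live_machines):
        for shard_id, holders in self.replica_locations.items():
            if dead not in holders:
                continue
            holders.discard(dead)
            if len(holders) >= self.target_copies or not holders:
                continue  # still enough copies, or nothing left to copy from
            source = sorted(holders)[-1]
            spare = sorted(m for m in live_machines if m not in holders)
            if spare:
                task = {"action": "copy", "shard": shard_id, "source": source}
                self.pending_tasks[spare[0]].append(task)

    def handle_heartbeat(self, machine):
        # Tasks ride back to the machine inside the heartbeat response.
        return self.pending_tasks.pop(machine, [])

    def confirm_copy(self, machine, shard_id):
        self.replica_locations[shard_id].add(machine)


master = Master({"1TB data 01": ["machine-01", "machine-02", "machine-03"]})
master.handle_machine_down(
    "machine-03", live_machines={"machine-01", "machine-02", "machine-04"}
)
tasks = master.handle_heartbeat("machine-04")
print(tasks)  # [{'action': 'copy', 'shard': '1TB data 01', 'source': 'machine-02'}]
for task in tasks:
    master.confirm_copy("machine-04", task["shard"])
print(sorted(master.replica_locations["1TB data 01"]))
# ['machine-01', 'machine-02', 'machine-04']
```

The point of the design is that the Master never has to open a connection to machine 4 on its own: it simply waits for the next heartbeat and attaches the task to the response.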
7. Delete redundant copies
On the other hand, suppose machine 3 suddenly recovers. It still has its copy 3 of “1TB data 01”, which means there are now 4 copies of “1TB data 01”. Isn’t one copy redundant?
It doesn’t matter. Once the Master node senses that machine 3 has come back to life, it will find that there are too many copies and will generate a copy deletion task.
When machine 3 sends a heartbeat, the Master returns a copy deletion command in the response, telling machine 3 to delete its own redundant copy. In this way, the number of copies stays at three.
Same as before, let’s look at the picture below.
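The cleanup step can be sketched in the same spirit. The function name `handle_machine_recovered`, the command format, and the assumption that a recovered machine reports the shards it still holds are all made up for illustration.

```python
def handle_machine_recovered(machine, reported_shards, replica_locations, target_copies=3):
    """Return the delete commands to send back in the machine's heartbeat response."""
    commands = []
    for shard_id in reported_shards:
        holders = replica_locations.setdefault(shard_id, set())
        if machine not in holders and len(holders) >= target_copies:
            # The shard already has enough copies elsewhere, so this one is redundant.
            commands.append({"action": "delete", "shard": shard_id})
        else:
            # The shard was short of copies, so the recovered copy is welcome.
            holders.add(machine)
    return commands


replica_locations = {"1TB data 01": {"machine-01", "machine-02", "machine-04"}}
commands = handle_machine_recovered("machine-03", ["1TB data 01"], replica_locations)
print(commands)  # [{'action': 'delete', 'shard': '1TB data 01'}]
print(len(replica_locations["1TB data 01"]))  # 3
```

If the shard had still been short of copies when machine 3 came back, the sketch would simply have accepted its copy instead of deleting it.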
8. Summary
Well, after this plain-English explanation and more than ten diagrams that evolve step by step, I believe that even if you knew nothing about distributed systems before, you can now understand how a complete data fault-tolerance architecture for a distributed system is designed.
In fact, this mechanism of data sharding, multi-copy redundancy, downtime detection, automatic copy migration, and redundant copy deletion is used in similar form by many systems, such as Hadoop and Elasticsearch.
Therefore, I strongly suggest that you absorb these fault-tolerant architecture ideas from distributed systems and middleware systems.
That way, when you learn similar technologies in the future, their principles and ideas will give you a sense of déjà vu.
Author: Architectural Notes of Huoia. Link: juejin.cn/post/684490… Source: Juejin. Copyright belongs to the author; please contact the author to obtain authorization.