Preface
Because everyone reads source code differently, the process here is not a set of best practices but a record of my own reading. If anything is missing, please point it out in the comments. In the past, many friends have pointed out my mistakes or suggested improvements, for which I am sincerely grateful.
Then, following the usual routine, we set out the tasks below, and the whole chapter works to complete and verify them:
1. DataNode initialization: when we build a cluster, we can see the DataNode service through the jps command, so the DataNode should be an RPC server.
2. DataNode registration: HDFS is a master-slave architecture where the NameNode is the master node and the DataNodes are the slave nodes, so DataNodes need to register with the NameNode when they start.
3. Heartbeat mechanism: the slave nodes need to send heartbeats to tell the master node that they are still alive.
4. How does NameNode manage metadata
(Added) The principle of the Hadoop HA solution
First, a quick review of the basics, because diving straight into the source code is a real sleep-inducer.
In Hadoop 1.x we have only one NameNode plus the DataNodes. The NameNode is responsible for managing metadata, and the DataNodes store the data. To keep data safe, each block is stored as three replicas, and each block is 64 MB.
The NameNode is a single point of failure, so the Hadoop team set out to solve the problem. Because the NameNode manages the cluster's metadata, it is a stateful service, which means it cannot simply be stopped. Since the problem is that there is only one NameNode, why not just add more? But if we add a NameNode without further thought, how do the two NameNodes keep their metadata consistent?
To solve this single point of failure, the first question is: how do two NameNodes keep their metadata consistent?
① Use a Linux shared storage directory
This essentially moves the cluster metadata out of the NameNode and into a shared storage unit. Apache recommended it at the time, but nobody bought in, and I'm sure nobody does it that way anymore.
② Use a ZooKeeper cluster
One NameNode writes metadata to the ZooKeeper cluster, and the other reads metadata from ZooKeeper. Some companies actually use this solution.
③ Cloudera's QJM
The third solution, proposed by Cloudera, uses a JournalNode cluster, which is not that different from ZooKeeper and follows the same logic.
In addition, it is a cluster itself, and its health criterion is that more than half of the nodes survive. For example, if one of the three nodes in my figure fails, 2/3 > 0.5, so the cluster is still judged healthy.
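As a quick illustration, the "more than half alive" rule boils down to a single comparison. This is a sketch of the rule, not Hadoop source:

```java
// Sketch of the JournalNode quorum rule (illustrative, not Hadoop source):
// a cluster of n nodes is healthy as long as strictly more than half survive.
public class QuorumCheck {
    public static boolean isHealthy(int aliveNodes, int totalNodes) {
        // alive / total > 0.5, written with integers to avoid floating point
        return aliveNodes * 2 > totalNodes;
    }
}
```

Note that exactly half is not enough: a 4-node cluster with 2 survivors is unhealthy, which is why such quorum clusters are deployed with an odd number of nodes.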
You might ask: why did we just decide that the first NameNode writes the metadata and the second one synchronizes it? Because each of them has a state.
The active NameNode writes metadata to the JournalNode cluster, while the standby NameNode reads metadata from it.
If the active NameNode suddenly goes down in the early morning, I would have to turn on my computer, connect to the company servers, and run the command that forcibly switches the standby NameNode to active to restore the cluster.
So how do we make the NameNode state switch over automatically?
④ Automatic switching of the NameNode state
A directory is created in ZooKeeper, and the NameNodes race to grab a lock under it when they start. The first NameNode to grab the lock becomes active.
In addition, a ZKFC service runs alongside each NameNode and continuously monitors its health. If the active NameNode has a problem, ZKFC reports it to ZooKeeper, ZooKeeper hands the lock to the standby NameNode, and that NameNode automatically switches to active.
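To make the lock-grabbing idea concrete, here is a toy analogy using a compare-and-set slot. The real mechanism is a ZooKeeper ephemeral node watched by ZKFC; the class and method names below are made up purely for illustration:

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy analogy for ZooKeeper lock preemption (made-up names, not real ZKFC code):
// the "lock" is a single slot; the first NameNode to fill it becomes active.
public class FailoverDemo {
    static final AtomicReference<String> lock = new AtomicReference<>(null);

    // Returns "active" if this node grabbed the lock, "standby" otherwise.
    static String tryBecomeActive(String nodeId) {
        return lock.compareAndSet(null, nodeId) ? "active" : "standby";
    }

    // ZKFC-style monitoring: when the active node fails, release the lock
    // so the standby NameNode can grab it and switch to active.
    static void reportFailure(String nodeId) {
        lock.compareAndSet(nodeId, null);
    }
}
```

In ZooKeeper the "slot" is an ephemeral znode, so if the active NameNode's session dies, the lock is released automatically rather than by an explicit call.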
1. DataNode initialization (source snippets based on Hadoop 2.7.0)
We go straight to DataNode's main method, which is similar to NameNode's: after a parameter check, it exits if the check fails, leaving only secureMain(args, null), and that is the core line.
Click in and there is a createDataNode; names like this are the ones I like best, so click createDataNode.
Copy the comment first and paste it into Baidu Translate.
The comment tells us that this method instantiates and starts a DataNode instance, and that the "start" part is really just launching a bunch of threads.
Read the source code with a purpose in mind, otherwise you'll be dragged away by incidental code: try blocks, Configuration, args, parameter setup, none of which we care about. Right now we want to know how the DataNode is instantiated, so just look at the instantiation part.
The method at the end returns a DataNode, so that's the one we click into next.
What you see first is a pile of configuration parameters, so scroll down to where we want to be, around line 465, where there is a try block. Let's look at it.
startDataNode actually goes through several steps
We’ve seen the code to start the DataNode, so this is what we want to see
① DataXceiver
Addition: at roughly line 1182, initDataXceiver(conf) initializes our DataXceiver. What does it do? Click in.
As we know, the NameNode is only responsible for managing the cluster's metadata; the actual data is stored on the DataNodes. In fact, DataNodes receive uploaded data through the DataXceiver service.
Around line 974 there is code that marks this thread as a daemon (background) thread, which means it lives and dies with the rest of the program: once the main thread and all other non-daemon threads finish, the JVM exits and this thread stops with it.
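In Java that looks like the following minimal sketch (a generic JDK example, not the Hadoop source; the thread name is made up):

```java
// Minimal example of a daemon ("background") thread in Java. A daemon thread
// does not keep the JVM alive: once all non-daemon threads (e.g. main) finish,
// the JVM exits and the daemon thread dies with it.
public class DaemonDemo {
    public static Thread newDaemon(Runnable task) {
        Thread t = new Thread(task, "dataXceiverServer-demo"); // illustrative name
        t.setDaemon(true); // must be set before start()
        return t;
    }

    public static void main(String[] args) {
        Thread t = newDaemon(() -> {
            while (true) { } // pretend to serve connections forever
        });
        t.start();
        // main returns here; the JVM still exits despite the infinite loop,
        // because the only remaining thread is a daemon.
    }
}
```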
② startInfoServer
Go back to startDataNode.
In the NameNode startup chapter (1.4.1, parsing the NameNode startup process), there is a similar startHttpServer; at the time, startHttpServer bound servlets to enhance its functionality. Our DataNode likewise binds servlets to enhance itself.
For example, one of the servlets bound here serves file checksums.
③ Initialize the RPC service
Go back to startDataNode again and find initIpcServer.
initIpcServer is the method that starts the RPC service.
This code is familiar; it is very similar to the Hadoop source chapter on the NameNode startup process (1.6.2, verifying that the required parameters are set). It likewise creates a server and binds a number of protocols, such as interDatanodeProtocolXlator for communication between DataNodes, and each protocol also has a unique versionID.
④ Create the BlockPoolManager
The function of BlockPoolManager is given in its comment, so let's click on refreshNamenodes.
If you see a do-prefixed method, go ahead and click into it. It's over 100 lines of code, so we'll only highlight the two most important points:
Determine how many different pieces of metadata exist
It determines whether each new nameservice is an update to an existing nameservice or a brand-new one.
The HDFS cluster we set up is an HA high-availability architecture, that is, the NameNode is split into active and standby, and the two manage the same metadata, so they manage the same nameservice. A nameservice is just a directory of metadata. In a federation, each federation member manages one piece of metadata, so two federation members naturally have two different pieces of metadata.
Traverse the federations
Create a BPOfferService for each federation (nameservice), and within it a BPServiceActor for each NameNode.
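The resulting object structure can be sketched like this. The class names mirror Hadoop's BPOfferService/BPServiceActor, but the bodies are simplified stand-ins, not the real implementations:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified sketch of what refreshNamenodes builds (illustrative stand-ins,
// not the real classes): one BPOfferService per nameservice (federation
// member), containing one BPServiceActor per NameNode of that nameservice.
public class BlockPoolManagerSketch {
    static class BPServiceActor {
        final String nnAddress;                 // the NameNode this actor talks to
        BPServiceActor(String nnAddress) { this.nnAddress = nnAddress; }
    }

    static class BPOfferService {
        final List<BPServiceActor> actors = new ArrayList<>();
        BPOfferService(List<String> nnAddresses) {
            for (String addr : nnAddresses) {   // one actor per NameNode (active + standby)
                actors.add(new BPServiceActor(addr));
            }
        }
    }

    // nameservices: nameservice id -> that nameservice's NameNode addresses
    public static List<BPOfferService> build(Map<String, List<String>> nameservices) {
        List<BPOfferService> services = new ArrayList<>();
        for (List<String> nns : nameservices.values()) {
            services.add(new BPOfferService(nns)); // one offer service per nameservice
        }
        return services;
    }
}
```

So an HA cluster with two nameservices and two NameNodes each would get two BPOfferService objects, each holding two BPServiceActors.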
At this point the initialization steps are complete. Feeling a bit lost? No worries, on to the registration process.
2. DataNode registration process
There is a startAll() at the end of that spot; that is the main logic for registration, so click in.
You can see here that it iterates over the federations, and then we click on the start() method.
So the flow is: first iterate over the federations, then iterate over the NameNodes within each federation, and register the DataNode with each of those NameNodes. Keep clicking into start().
Calling a thread's start() method ends up running its run() method, as you probably already know.
connectToNNAndHandshake() handshakes with the NameNode. An infinite loop is used to ensure the registration eventually executes: if anything goes wrong, sleep for 5 seconds and try again; if it succeeds, break. In other words: "I'll stay in this loop until you succeed."
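That "loop until it works" pattern looks roughly like this (a generic sketch, not the actual BPServiceActor code; the sleep duration is a parameter here, whereas Hadoop waits about 5 seconds):

```java
import java.util.concurrent.Callable;

// Generic sketch of the retry-until-success loop used for registration
// (not the actual BPServiceActor code; Hadoop sleeps ~5000 ms between tries).
public class RetryLoop {
    public static <T> T retryForever(Callable<T> task, long sleepMillis) {
        while (true) {                       // loop forever: only success gets us out
            try {
                return task.call();          // success: "break" by returning
            } catch (Exception e) {          // anything goes wrong: sleep, then retry
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }
}
```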
2.1 connectToNNAndHandshake()
We want connectToNNAndHandshake() to execute successfully, so click into it.
The main goal here is to register the DataNode by calling a NameNode method; in effect, storing the DataNode's information on the NameNode. Here we obtain a proxy of the NameNode. Why do we need a proxy?
The proxy object holds an internal reference to the real object so that it can manipulate it, and it exposes the same interface as the real object so that it can stand in for it at any time. A proxy can also attach extra operations around calls to the real object, effectively wrapping it.
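Java's built-in dynamic proxy shows the idea. This is a generic JDK example with a made-up, trimmed-down interface; Hadoop's RPC proxy works on the same principle, except its handler ships each call over the network to the NameNode:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Generic JDK dynamic-proxy example. Hadoop's RPC proxy is built on the same
// principle, except its handler sends each call to the remote NameNode
// instead of invoking a local object. The interface below is a made-up,
// trimmed-down stand-in.
public class ProxyDemo {
    public interface DatanodeProtocol {
        String registerDatanode(String info);
    }

    public static DatanodeProtocol wrap(DatanodeProtocol real, StringBuilder log) {
        InvocationHandler h = (proxy, method, args) -> {
            // extra behavior attached around the real call
            log.append("calling ").append(method.getName()).append("; ");
            return method.invoke(real, args); // forward to the real object
        };
        // the proxy implements the same interface, so it can replace the real object
        return (DatanodeProtocol) Proxy.newProxyInstance(
                DatanodeProtocol.class.getClassLoader(),
                new Class<?>[]{DatanodeProtocol.class}, h);
    }
}
```

The caller only ever sees the interface, so it cannot tell whether it is talking to a local object or, as in Hadoop RPC, a stub that serializes the call and sends it to a server.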
So at this point we get a NameNode proxy object, bpNamenode, and register with register().
2.2 DataNode Registration Information createRegistration
Further down you can see the DatanodeRegistration fields, for those interested.
We can also see that our host name, port number, and so on are sent along in this process.
registerDatanode is called after the DataNode information is obtained
Back to register().
Because the registration code on the NameNode side is also very important, the same dead loop is used to ensure the registration executes: if it succeeds, break; on an exception, sleep one second and retry.
At this point we see a registerDatanode method called through the bpNamenode proxy object, which obviously invokes NamenodeRpcServer's method of the same name. If you don't believe me, open NamenodeRpcServer and Ctrl+F for registerDatanode.
Let's click into it.
Inside there is a try block, in which the DatanodeManager's registerDatanode is called.
It's a long method, so scroll all the way down to about line 995, where we see an addDatanode call.
addDatanode(nodeDescr)
The nodeDescr parameter is the registration information for a DataNode encapsulated in createRegistration
Registering the information just means filling in these data structures, such as this datanodeMap; click on it.
The first parameter is the DataNode's unique identifier, and the second, the DataNode descriptor, is the DataNode's registration information; it is stored into several different data structures.
addDatanode(nodeDescr)
The nodeDescr parameter is the same as above: the DataNode's registration information.
The nice thing is that when you later iterate for heartbeats, you iterate over this data structure of DataNode descriptors directly, instead of first traversing the DataNodes and then pulling the heartbeat information out of them one by one.
In general, registration is all about storing information into data structures; once it is written, registration is complete.
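Conceptually, then, registration reduces to a map insert keyed by the DataNode's unique identifier. The sketch below is illustrative; the field and class names only loosely mirror the real datanodeMap and DatanodeDescriptor:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: "registration = storing the DataNode's info in data
// structures". The real code fills several structures (datanodeMap, the
// heartbeat manager's list, etc.); the names here only loosely mirror it.
public class RegistrationSketch {
    static class DatanodeDescriptor {          // stand-in for the real descriptor
        final String hostName;
        final int port;
        DatanodeDescriptor(String hostName, int port) {
            this.hostName = hostName;
            this.port = port;
        }
    }

    // key: the DataNode's unique identifier; value: its description
    final Map<String, DatanodeDescriptor> datanodeMap = new HashMap<>();

    void addDatanode(String datanodeUuid, DatanodeDescriptor nodeDescr) {
        datanodeMap.put(datanodeUuid, nodeDescr); // once written, registration is done
    }
}
```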
3. Start sending heartbeats
Jump back to 2.1 connectToNNAndHandshake() and look below it, at around line 890 of the BPServiceActor class: if connectToNNAndHandshake succeeds, we break out and move on to the heartbeat-sending steps.
This loop really is an infinite loop: there is no break, and an exception only triggers a retry. Let's click in and take a look.
Heartbeat interval between DataNode and NameNode
if (startTime - lastHeartbeat >= dnConf.heartBeatInterval) checks whether the current time minus the lastHeartbeat time is at least dnConf.heartBeatInterval. So how long is heartBeatInterval? Click in.
Take a look at DFS_HEARTBEAT_INTERVAL_DEFAULT: the default value is 3 (seconds).
So, as we will see later, DataNodes and NameNodes maintain a heartbeat every three seconds.
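The timing check amounts to one subtraction, sketched here (illustrative, not the real offerService loop; 3 seconds is the dfs.heartbeat.interval default):

```java
// Sketch of the heartbeat-timing check (illustrative, not the real offerService
// loop). dfs.heartbeat.interval defaults to 3 seconds.
public class HeartbeatTimer {
    static final long HEARTBEAT_INTERVAL_MS = 3 * 1000L;

    // send a heartbeat only once at least 3 s have passed since the last one
    public static boolean shouldSendHeartbeat(long startTimeMs, long lastHeartbeatMs) {
        return startTimeMs - lastHeartbeatMs >= HEARTBEAT_INTERVAL_MS;
    }
}
```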
HeartbeatResponse resp = sendHeartBeat(): the HeartbeatResponse carries the commands the NameNode sends back to the DataNode. Click sendHeartBeat().
When the heartbeat is sent, some DataNode information is attached to it. For example, the first parameter, reports; click on it to check.
As you can see, it covers storage capacity, used space, remaining space, and so on.
Send the heartbeat
The heartbeat is sent through the NameNode proxy object bpNamenode, which directly invokes NameNode's sendHeartbeat method. We go straight to NamenodeRpcServer to find the method with the same name.
The parameters come in; let's keep going.
cmds holds the instructions the NameNode wants the DataNode to carry out. They are wrapped in a HeartbeatResponse object and returned to the DataNode as the response.
Now let's look at how the heartbeat is handled.
handleHeartbeat
By this point the code may feel repetitive, but reading source code is exactly this kind of tangled process.
The first call, getDatanode, retrieves the corresponding DataNode descriptor based on each DataNode's unique ID.
The updateHeartbeat method drills all the way down to the method that updates the heartbeat state.
Because our DataNode is always at work, it must keep its own status up to date, and crucially it must update its last-heartbeat time: as we just saw, a heartbeat is sent whenever current time minus last-heartbeat time reaches 3 s, so this step is essential. The heartbeat is also the indicator used to judge whether a DataNode is alive: if no heartbeat arrives within a certain period (a configurable value), the NameNode decides the node has failed and has the blocks it held replicated one more time elsewhere.
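That liveness check is the same kind of subtraction with a much larger, configurable timeout (in Hadoop the expiry defaults to 2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval = 2 * 300 s + 10 * 3 s = 630 s, i.e. 10.5 minutes). A sketch:

```java
// Sketch of the NameNode-side liveness check (illustrative, not Hadoop source):
// a DataNode whose last heartbeat is older than the expiry interval is treated
// as dead, and the NameNode schedules extra replicas of its blocks elsewhere.
public class LivenessCheck {
    public static boolean isDead(long nowMs, long lastHeartbeatMs, long expireIntervalMs) {
        return nowMs - lastHeartbeatMs > expireIntervalMs;
    }
}
```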
And that wraps up the DataNode.