1. SecondaryNameNode explanation
1.1. Working mechanism of SecondaryNameNode
1) The first stage: NameNode startup
- 1: On the first startup of the NameNode, after formatting, the fsimage and edits files are created. On subsequent startups, the NameNode loads the edit log and image file directly into memory.
- 2: The client requests metadata operations (create, delete, update, query).
- 3: The NameNode records the operations in the edit log and rolls the log.
- 4: The NameNode applies the create, delete, update, and query operations to the metadata in memory.
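The fsimage and edits files described above live in the NameNode's local metadata directory, which is set in hdfs-site.xml. A minimal sketch; the path below is only an example, not a required value:

```xml
<!-- hdfs-site.xml: where the NameNode keeps its fsimage and edits files -->
<property>
  <name>dfs.namenode.name.dir</name>
  <!-- example path; defaults to file://${hadoop.tmp.dir}/dfs/name -->
  <value>file:///data/hadoop/dfs/name</value>
</property>
```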
2) The second stage: the SecondaryNameNode at work
- 1: The SecondaryNameNode asks the NameNode whether a checkpoint is needed; the NameNode returns whether a checkpoint should run.
- 2: The SecondaryNameNode requests that a checkpoint be executed.
- 3: The NameNode rolls the edits log that is currently being written.
- 4: The rolled edit logs and the image file are copied to the SecondaryNameNode.
- 5: The SecondaryNameNode loads the edit logs and image file into memory and merges them.
- 6: It generates a new image file, fsimage.checkpoint.
- 7: fsimage.checkpoint is copied to the NameNode.
- 8: The NameNode renames fsimage.checkpoint to fsimage.
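Whether the NameNode "needs" a checkpoint in step 1 is decided by two thresholds in hdfs-site.xml: a checkpoint fires after a fixed time interval, or after enough uncheckpointed transactions accumulate, whichever comes first. The values below are the Hadoop defaults, shown here as a sketch:

```xml
<!-- hdfs-site.xml: checkpoint trigger settings (values shown are the defaults) -->
<property>
  <name>dfs.namenode.checkpoint.period</name>
  <!-- checkpoint at least once every 3600 seconds (1 hour) -->
  <value>3600</value>
</property>
<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <!-- or as soon as 1,000,000 uncheckpointed transactions have accumulated -->
  <value>1000000</value>
</property>
```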
1.2. The differences and connections between NameNode and SecondaryNameNode
1) Differences
- 1: The NameNode is responsible for managing the metadata of the entire file system and the data block information corresponding to each path (file).
- 2: The SecondaryNameNode periodically merges the namespace image with the namespace edit log.
2) Connections
- 1: The SecondaryNameNode keeps a copy of the NameNode's image file (fsimage) and edit logs (edits).
- 2: When the primary NameNode fails (assuming the data was not backed up in time), data can be restored from the SecondaryNameNode, though any edits made after the last checkpoint may be lost.
1.3. Steps for commissioning new data nodes and decommissioning old ones
1) Bringing a node online
Before bringing a new data node online, you need to add it to the dfs.hosts file.
- 1: Disable the firewall on the new node.
- 2: Add the hostname of the new data node to the hosts file on the NameNode node.
- 3: Set up SSH password-free login between the NameNode node and the new node.
- 4: Add the hostname of the new node to dfs.hosts on the NameNode node, then refresh the node information: hdfs dfsadmin -refreshNodes
- 5: On the NameNode node, edit the slaves file and add the hostname of the new data node.
- 6: Start the DataNode process on the new node.
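The dfs.hosts whitelist referenced in the steps above is itself declared in hdfs-site.xml. A sketch; the file path below is hypothetical:

```xml
<!-- hdfs-site.xml: whitelist of DataNodes allowed to register with the NameNode -->
<property>
  <name>dfs.hosts</name>
  <!-- plain-text file, one permitted DataNode hostname per line (example path) -->
  <value>/opt/hadoop/etc/hadoop/dfs.hosts</value>
</property>
```

After editing the whitelist file, running `hdfs dfsadmin -refreshNodes` makes the NameNode re-read it without a restart.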
2) Taking a node offline
- 1: Modify the conf/hdfs-site.xml file to configure dfs.hosts.exclude.
- 2: List the machine to be taken offline in the dfs.hosts.exclude file; this prevents that machine from connecting to the NameNode.
- 3: Run bin/hadoop dfsadmin -refreshNodes. The NameNode moves the node's blocks to other machines in the background; while this is in progress the node reports "Decommission Status : Decommission in progress", and when it finishes the status changes to "Decommission Status : Decommissioned".
- 4: Once the machine is offline, remove it from the excludes file.
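Correspondingly, the exclude file used for decommissioning is declared with the dfs.hosts.exclude property; again, the path below is only an example:

```xml
<!-- hdfs-site.xml: blacklist of DataNodes to be decommissioned -->
<property>
  <name>dfs.hosts.exclude</name>
  <!-- hosts listed here are decommissioned after -refreshNodes (example path) -->
  <value>/opt/hadoop/etc/hadoop/dfs.hosts.exclude</value>
</property>
```

Decommission progress can be watched with `hdfs dfsadmin -report` or on the NameNode web UI, where the status transitions described in step 3 are displayed per node.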