Edits file: Every metadata operation that the NameNode performs on HDFS is recorded in the Edits log file, which is stored on the NameNode's local file system. For example, when a file is created in HDFS, the NameNode appends a record to Edits. Modifying or deleting a file likewise appends a record to the Edits log file.
FsImage image file: The file-to-data-block mapping, file attributes, and other metadata are stored in a file called FsImage, which also resides on the local file system of the NameNode.
2. Process Introduction:
① Load the fsimage image file into memory
② Load the edits file into memory
③ Merge the fsimage image file and the edits file
④ Write the merged result to fsimage
⑤ Clear the original edits data and use an empty edits file for normal operations
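The five startup steps above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the real fsimage and edits files are binary formats, whereas here the image is a JSON object and the edits log holds one JSON record per line. The names `apply_edit` and `startup_merge` and the record fields (`op`, `path`) are hypothetical and are not part of Hadoop.

```python
import json


def apply_edit(namespace, record):
    """Apply one edit-log record to the in-memory namespace (path -> attributes)."""
    op = record["op"]
    if op == "create":
        namespace[record["path"]] = {"replication": record.get("replication", 3)}
    elif op == "delete":
        namespace.pop(record["path"], None)
    else:
        raise ValueError(f"unknown op: {op}")


def startup_merge(fsimage_path, edits_path):
    # Step 1: load the fsimage into memory (toy format: a JSON object).
    with open(fsimage_path) as f:
        namespace = json.load(f)
    # Step 2: load the edits file into memory (toy format: one JSON record per line).
    with open(edits_path) as f:
        records = [json.loads(line) for line in f if line.strip()]
    # Step 3: merge by replaying every record on top of the image, in order.
    for record in records:
        apply_edit(namespace, record)
    # Step 4: write the merged result back to fsimage.
    with open(fsimage_path, "w") as f:
        json.dump(namespace, f)
    # Step 5: truncate edits so normal operation starts from an empty log.
    open(edits_path, "w").close()
    return namespace
```

Because the replay in step 3 is proportional to the number of records in the edits log, a log that has grown for a long time makes this startup step correspondingly slow.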
3. Flow chart analysis:
image.png
4. Because the NameNode merges fsimage and Edits only at startup, the edits file can grow larger and larger if the NameNode runs for a long time, and the next startup will then take a long time. Can the fsimage image file and the edits log file be merged periodically instead? The answer is yes: this is the job of the Secondary NameNode. What are the main functions of the Secondary NameNode, and how does it merge Fsimage and Edits? Let's look at it with these questions in mind.
II. Working process of Secondary NameNode
1. Differences between Secondary NameNode and NameNode:
NameNode: Stores file metadata. All metadata is held in memory, so the number of files HDFS can store is limited by the NameNode's memory. If the NameNode fails, HDFS fails, so the availability of the NameNode must be ensured.
Secondary NameNode: Synchronizes with the NameNode regularly. It periodically merges the fsimage image file and the Edits log file, passes the merged file to the NameNode to replace its image, and lets the old edit log be discarded. If the NameNode fails, the Secondary NameNode has to be promoted to the primary role manually. The directory where the latest checkpoint is saved has the same structure as the NameNode directory, so the NameNode can load the checkpoint image from the Secondary NameNode when needed.
A checkpoint is triggered by either of two conditions. The interval between checkpoints defaults to 3600 seconds and can be changed by setting fs.checkpoint.period. If the Edits log file grows beyond a maximum size, the merge is performed even if less than one hour has passed. This size is set with fs.checkpoint.size, and the default value is 64 MB.
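As a rough illustration, the two trigger conditions can be combined into a single check. The defaults below are the values quoted above. The function name `should_checkpoint` and the choice of `>=` at the boundaries are assumptions made for this sketch, not Hadoop's actual code.

```python
CHECKPOINT_PERIOD_SECONDS = 3600            # default of fs.checkpoint.period
CHECKPOINT_SIZE_BYTES = 64 * 1024 * 1024    # default of fs.checkpoint.size (64 MB)


def should_checkpoint(seconds_since_last, edits_size_bytes,
                      period=CHECKPOINT_PERIOD_SECONDS,
                      max_size=CHECKPOINT_SIZE_BYTES):
    """Return True when either trigger condition is met (hypothetical helper)."""
    # Condition 1: the configured interval has elapsed.
    # Condition 2: the edits log has reached the size limit, even inside the interval.
    return seconds_since_last >= period or edits_size_bytes >= max_size
```

For example, `should_checkpoint(600, 70 * 1024 * 1024)` is true because the edits log is over 64 MB even though only ten minutes have passed, while `should_checkpoint(600, 1024)` is false.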
① A checkpoint is triggered on the Secondary NameNode once the interval or size threshold is reached.
② The Secondary NameNode tells the NameNode to switch to a new, empty edits log file.
③ The Secondary NameNode fetches the fsimage image file (only the first time) and the pre-switch edits log file from the NameNode via HTTP.
④ The Secondary NameNode merges the fsimage and edits files in memory.
⑤ The Secondary NameNode sends the merged fsimage file back to the NameNode.
⑥ The NameNode replaces its old fsimage file with the fsimage file received from the Secondary NameNode.
4. Flow chart analysis:
image.png
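The checkpoint flow can also be modelled as a small in-memory simulation. This is a simplified sketch, not Hadoop's implementation: the classes below are hypothetical stand-ins, the HTTP transfers are replaced by plain copies, the image is fetched on every checkpoint rather than only the first time, and step ① (the trigger) is represented simply by calling `checkpoint()`.

```python
class NameNode:
    """Toy NameNode: an fsimage dict plus a list of logged operations."""

    def __init__(self, fsimage, edits):
        self.fsimage = dict(fsimage)   # last checkpointed namespace
        self.edits = list(edits)       # operations logged since that checkpoint
        self.rolled_edits = None       # old log frozen while a checkpoint runs

    def log(self, op, path):
        # Normal operation: every metadata change is appended to the current log.
        self.edits.append((op, path))

    def roll_edits(self):
        # Step 2: switch to an empty edits log and keep the old one for transfer.
        self.rolled_edits, self.edits = self.edits, []

    def install_image(self, merged):
        # Step 6: replace the old fsimage; the rolled edits are now redundant.
        self.fsimage = dict(merged)
        self.rolled_edits = None


class SecondaryNameNode:
    """Toy Secondary NameNode that performs one checkpoint on request."""

    def checkpoint(self, namenode):
        namenode.roll_edits()                  # step 2
        image = dict(namenode.fsimage)         # step 3: fetch the fsimage
        edits = list(namenode.rolled_edits)    # step 3: fetch the pre-switch edits
        for op, path in edits:                 # step 4: merge in memory
            if op == "create":
                image[path] = {}
            elif op == "delete":
                image.pop(path, None)
        namenode.install_image(image)          # steps 5 and 6
        return image
```

Rolling the log in step 2 is what lets the NameNode keep serving requests during the merge: new operations go to the fresh edits log, so they are not lost when the merged image replaces the old one.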