Hadoop master and slave run in different Docker containers. NameNode and ResourceManager run in the hadoop-master container; DataNode and NodeManager run in the hadoop-slave container. NameNode and DataNode are components of HDFS, the Hadoop distributed file system, which stores input and output data. ResourceManager and NodeManager are components of YARN, the Hadoop cluster resource management system, which schedules CPU and memory resources.
Let’s plan the cluster first:
Let’s explain some of the software in the plan:
NameNode is the HDFS NameNode. To ensure high availability it runs as a pair: one active node and one standby node.
The switchover between active and standby is handled by DFSZKFailoverController together with ZooKeeper.
In addition, three JournalNodes are created so that the NameNode edit logs are themselves highly available.
Finally, YARN is the resource management system that centrally manages and schedules the cluster's resources.
That covers the introduction. Next we will build the high availability cluster.
Step 1: go to the Hadoop directory and run docker-compose up -d
Step 2: run ./start-all.sh
Once both steps have finished, the console output shows whether the services started.
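Beyond reading the console output, it can help to check which daemons are actually running in each container. The following is only a sketch: `missing_daemons` is a hypothetical helper, not part of the repository, and the list of daemon names passed to it is an assumption about how this cluster is laid out. It reads `jps` output and prints every expected daemon that is not there.

```shell
# Hypothetical helper: print each expected daemon name that is absent from
# the `jps` output supplied on stdin.
missing_daemons() {
  jps_output="$(cat)"
  for daemon in "$@"; do
    # jps prints "<pid> <main class name>", so compare the second field exactly.
    if ! printf '%s\n' "$jps_output" |
      awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}

# Example (run inside a container, assuming jps is on the PATH and that this
# node is expected to host these three daemons):
#   jps | missing_daemons NameNode DFSZKFailoverController JournalNode
```

An empty result means every daemon in the list was found. Matching on the exact class name keeps a name such as SecondaryNameNode from being counted as NameNode.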
Here are some cluster validation operations:
1. Verify that HDFS works properly and that NameNode HA fails over correctly
First, upload a file to HDFS:
/usr/local/hadoop/bin/hadoop fs -put /usr/local/hadoop/README.txt /
Manually stop the NameNode on the node that is currently active:
/usr/local/hadoop/sbin/hadoop-daemon.sh stop namenode
Check through the web UI on HTTP port 50070 whether the standby NameNode has changed to active
Manually start the NameNode that was stopped in the previous step:
/usr/local/hadoop/sbin/hadoop-daemon.sh start namenode
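The same check can be made from the command line instead of the web UI. The sketch below uses `hdfs haadmin -getServiceState` and `hdfs dfs -test -e`; the helper names are hypothetical, and the NameNode IDs nn1 and nn2 are assumptions (the real IDs are whatever `dfs.ha.namenodes.<nameservice>` lists in hdfs-site.xml).

```shell
# Path to the hdfs CLI, overridable through the environment.
HDFS_BIN="${HDFS_BIN:-/usr/local/hadoop/bin/hdfs}"

# Print the ID of the first NameNode that reports "active"; fail if none does.
active_namenode() {
  for nn_id in "$@"; do
    state="$("$HDFS_BIN" haadmin -getServiceState "$nn_id" 2>/dev/null)"
    if [ "$state" = "active" ]; then
      echo "$nn_id"
      return 0
    fi
  done
  return 1
}

# Succeed only if some NameNode is active and the given HDFS path still exists.
hdfs_survived_failover() {
  path="$1"
  shift
  active_namenode "$@" >/dev/null && "$HDFS_BIN" dfs -test -e "$path"
}

# Example, assuming the NameNode IDs are nn1 and nn2:
#   hdfs_survived_failover /README.txt nn1 nn2 && echo "HDFS HA OK"
```

Run it after stopping the active NameNode: if the uploaded file is still reachable and one of the IDs reports active, the failover worked.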
2. Verify that YARN works properly and that ResourceManager HA fails over correctly
Run the WordCount program from the examples that ship with Hadoop:
/usr/local/hadoop/bin/hadoop fs -mkdir /wordcount
/usr/local/hadoop/bin/hadoop fs -mkdir /wordcount/input
/usr/local/hadoop/bin/hadoop fs -mv /README.txt /wordcount/input
/usr/local/hadoop/bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar wordcount /wordcount/input /wordcount/output
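To confirm that the job really produced a result, look at its output. This is a small sketch under stated assumptions: `top_words` is a hypothetical helper, the output is assumed to be tab-separated "word, count" lines, and the reducer files are assumed to follow the usual part-r-* naming.

```shell
# Hypothetical helper: read WordCount output ("word<TAB>count" per line) on
# stdin and print the N most frequent words, highest count first (default 10).
top_words() {
  sort -t "$(printf '\t')" -k2,2nr | head -n "${1:-10}"
}

# Example, assuming the reducer output files are named part-r-*:
#   /usr/local/hadoop/bin/hadoop fs -cat '/wordcount/output/part-r-*' | top_words 5
```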
Verify the ResourceManager HA
Manually stop ResourceManager on Node02
/usr/local/hadoop/sbin/yarn-daemon.sh stop resourcemanager
Access ResourceManager on Node01 through HTTP port 8088 to check the status
Manually start ResourceManager on Node02
/usr/local/hadoop/sbin/yarn-daemon.sh start resourcemanager
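The ResourceManager state can also be polled from the shell with `yarn rmadmin -getServiceState`, which is handy because a failover takes a few seconds. A minimal sketch: `wait_for_rm_state` is a hypothetical helper, and the IDs rm1 (Node01) and rm2 (Node02) are assumptions; the real ones come from `yarn.resourcemanager.ha.rm-ids` in yarn-site.xml.

```shell
# Path to the yarn CLI, overridable through the environment.
YARN_BIN="${YARN_BIN:-/usr/local/hadoop/bin/yarn}"

# Poll until the given ResourceManager ID reports the wanted HA state.
# Arguments: <rm-id> <wanted-state> [attempts, default 10] [delay seconds, default 3]
# Returns 0 once the state matches, 1 if every attempt is used up.
wait_for_rm_state() {
  rm_id="$1"
  wanted="$2"
  attempts="${3:-10}"
  delay="${4:-3}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    state="$("$YARN_BIN" rmadmin -getServiceState "$rm_id" 2>/dev/null)"
    [ "$state" = "$wanted" ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example, assuming rm1 runs on Node01 and rm2 on Node02:
#   wait_for_rm_state rm1 active && echo "rm1 took over"
```

After restarting the ResourceManager on Node02, the same helper can be called with `rm2 standby` to confirm that it rejoined as the standby.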
Code repository: https://github.com/zhuanxuhit/distributed-system/issues/4 (stars and follows are welcome)
Your encouragement is my motivation to continue writing, and I look forward to our common progress.