In the previous article [First Hadoop instance: juejin.cn/post/684490…], the JDK and Hadoop were installed and the environment variables were configured. This article covers deploying a distributed Hadoop cluster on top of that original stand-alone setup.
VM Preparations
Clone three VMs from the machine configured in the previous article, change their IP addresses, disable the firewall, and set the host names. These basic operations are not the focus of this article and are not covered here. Note: after changing a host name, update the /etc/hosts file on every node (an example follows the list below).
The virtual machine configuration in this example:

hadoop001: 192.168.48.111
hadoop002: 192.168.48.112
hadoop003: 192.168.48.113
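For reference, a minimal /etc/hosts entry set on each node might look like the following, using the host names and addresses listed above:

```
192.168.48.111 hadoop001
192.168.48.112 hadoop002
192.168.48.113 hadoop003
```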
Passwordless SSH login
- Go to the .ssh directory under the home directory.

```
cd ~/.ssh
```
- Generate the public and private keys. Type the following command and press Enter three times; two files are generated: id_rsa (private key) and id_rsa.pub (public key).

```
ssh-keygen -t rsa
```
- Copy the public key to the target machines that should allow passwordless login.

```
ssh-copy-id 192.168.48.112
ssh-copy-id 192.168.48.113
```
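To verify, an SSH connection to one of the target machines should now succeed without a password prompt (a quick check, assuming the key copy succeeded):

```
ssh 192.168.48.112
exit
```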
Configure the cluster
Cluster Deployment Planning
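Based on the configuration files below (NameNode address, SecondaryNameNode address, ResourceManager host, and the slaves file), the daemons are distributed as follows:

|      | hadoop001          | hadoop002                    | hadoop003                   |
| ---- | ------------------ | ---------------------------- | --------------------------- |
| HDFS | NameNode, DataNode | DataNode                     | SecondaryNameNode, DataNode |
| YARN | NodeManager        | ResourceManager, NodeManager | NodeManager                 |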
Modifying the Configuration Files
Directory where the configuration files reside: /opt/module/hadoop-2.7.2/etc/hadoop/
- core-site.xml

```xml
<configuration>
    <!-- Specify the HDFS NameNode address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop001:9000</value>
    </property>
    <!-- Specify the storage directory for files generated at Hadoop runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop-2.7.2/data/tmp</value>
    </property>
</configuration>
```
- hadoop-env.sh

```
export JAVA_HOME=/opt/module/jdk1.8.0_131
```
- hdfs-site.xml

```xml
<configuration>
    <!-- Number of HDFS replicas -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- SecondaryNameNode HTTP address -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop003:50090</value>
    </property>
</configuration>
```
- slaves (one host name per line; do not add extra spaces or blank lines)

```
hadoop001
hadoop002
hadoop003
```
- yarn-env.sh

```
export JAVA_HOME=/opt/module/jdk1.8.0_131
```
- yarn-site.xml

```xml
<configuration>
    <!-- Site specific YARN configuration properties -->
    <!-- How the Reducer obtains data -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Specify the address of the YARN ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop002</value>
    </property>
</configuration>
```
- mapred-env.sh

```
export JAVA_HOME=/opt/module/jdk1.8.0_131
```
- mapred-site.xml

```xml
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```
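Note: Hadoop 2.7.2 ships only mapred-site.xml.template by default. If mapred-site.xml does not exist yet, it can be created from the template first (a minimal sketch):

```
cd /opt/module/hadoop-2.7.2/etc/hadoop
cp mapred-site.xml.template mapred-site.xml
```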
Copy the configuration files to the other VMs in the cluster

```
[root@hadoop001 etc]# scp -r hadoop/ root@hadoop002:/opt/module/hadoop-2.7.2/etc/
[root@hadoop001 etc]# scp -r hadoop/ root@hadoop003:/opt/module/hadoop-2.7.2/etc/
```
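Optionally, a quick check that the files actually arrived on a worker node (a sketch; any of the copied files would do):

```
ssh hadoop002 "grep -A 1 fs.defaultFS /opt/module/hadoop-2.7.2/etc/hadoop/core-site.xml"
```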
Starting the cluster
- If the cluster is being started for the first time, format the NameNode.

```
[root@hadoop001 hadoop-2.7.2]# bin/hdfs namenode -format
```
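If the NameNode ever needs to be reformatted later, stop all daemons and delete the data and log directories on every node first, otherwise the new cluster ID will not match the existing DataNodes (a sketch, assuming the directories used in this setup):

```
rm -rf /opt/module/hadoop-2.7.2/data /opt/module/hadoop-2.7.2/logs
```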
- Start HDFS.

```
[root@hadoop001 hadoop-2.7.2]# sbin/start-dfs.sh
```
- Start YARN. Note: YARN must be started on the machine where the ResourceManager runs (hadoop002 in this example).

```
[root@hadoop002 hadoop-2.7.2]# sbin/start-yarn.sh
```
- Check whether the startup succeeded by visiting the web UIs in a browser:

NameNode UI: http://192.168.48.111:50070
ResourceManager UI: http://192.168.48.112:8088/cluster
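Alternatively, running jps on each node should show daemons consistent with the deployment plan above (a sketch of the expected process lists; PIDs omitted):

```
[root@hadoop001 ~]# jps   # NameNode, DataNode, NodeManager
[root@hadoop002 ~]# jps   # ResourceManager, DataNode, NodeManager
[root@hadoop003 ~]# jps   # SecondaryNameNode, DataNode, NodeManager
```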
Hadoop start and stop methods
- Start or stop each service component individually:

(1) HDFS components

```
hadoop-daemon.sh start|stop namenode|datanode|secondarynamenode
```

(2) YARN components

```
yarn-daemon.sh start|stop resourcemanager|nodemanager
```
- Start or stop each module separately (passwordless SSH is a prerequisite):

(1) HDFS

```
start-dfs.sh
stop-dfs.sh
```

(2) YARN

```
start-yarn.sh
stop-yarn.sh
```
- Start or stop everything at once (not recommended):

```
start-all.sh
stop-all.sh
```
Operating the cluster
- Create an input folder on the HDFS file system.

```
[root@hadoop001 hadoop-2.7.2]# bin/hdfs dfs -mkdir -p /user/sixj/graphs/wordcount/input
```
- Upload the test content to the file system.

```
[root@hadoop001 hadoop-2.7.2]# bin/hdfs dfs -put wcinput/wc.input /user/sixj/graphs/wordcount/input
```
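This step assumes a local file wcinput/wc.input already exists (for example, the one created in the stand-alone article). If it does not, a small test file can be created first (a sketch; the exact contents are arbitrary):

```
mkdir -p wcinput
echo "hadoop yarn hadoop mapreduce sixj JAVA sixj" > wcinput/wc.input
```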
- View the file list.

```
[root@hadoop001 hadoop-2.7.2]# bin/hdfs dfs -ls /user/sixj/graphs/wordcount/input
```
- View the file contents.

```
[root@hadoop001 hadoop-2.7.2]# bin/hdfs dfs -cat /user/sixj/graphs/wordcount/input/wc.input
hadoop yarn hadoop mapreduce sixj JAVA sixj
```
- Download the file to the local machine.

```
[root@hadoop001 hadoop-2.7.2]# hadoop fs -get /user/sixj/graphs/wordcount/input/wc.input
```
- Delete the file.

```
[root@hadoop001 hadoop-2.7.2]# hadoop fs -rm /user/sixj/graphs/wordcount/input/wc.input
```
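To exercise YARN as well, the bundled wordcount example can be run against the uploaded input (a sketch, run before deleting the input file; the output path is an arbitrary choice):

```
[root@hadoop001 hadoop-2.7.2]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /user/sixj/graphs/wordcount/input /user/sixj/graphs/wordcount/output
[root@hadoop001 hadoop-2.7.2]# bin/hdfs dfs -cat /user/sixj/graphs/wordcount/output/part-r-00000
```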