I. Environment configuration
Three VMs are available.
Because a cluster is made up of multiple servers, any configuration change must be synchronized across the whole cluster, which means making the same change on every machine. When a cluster has many servers, this repetitive work is tedious and error-prone, so a cluster distribution script is needed.
1.1 scp (Secure Copy)
scp can copy data from server to server. The basic syntax is as follows:
scp -r source-path/name user@host:destination-path/name
(command, -r for recursive, path/name of the file to copy, destination user@host:destination path/name)

[root@xiaolei module]# scp -r /etc/profile.d/my_env.sh root@hadoop104:/etc/profile.d/
1.2 xsync Cluster Distribution Script
The difference between rsync and scp: rsync is faster than scp because it only transfers files that differ, while scp copies every file. Basic syntax:
rsync -av $pdir/$fname $user@hadoop$host:$pdir/$fname
(command, options, path/name of the file to copy, destination user@host:destination path/name)

Options:
-a: archive copy
-v: display the copy process
Name the distribution script xsync.
Note: add /home/lei/bin to PATH in /etc/profile as shown below, so that a script placed there can be called from anywhere.
export PATH=$PATH:/home/lei/bin
Then create the bin directory under /home/lei and create the xsync file in it:
[lei@xiaolei module]$ cd /home/lei
[lei@xiaolei lei]$ mkdir bin
[lei@xiaolei lei]$ cd bin
[lei@xiaolei bin]$ touch xsync
[lei@xiaolei bin]$ vim xsync
Script content:
#!/bin/bash
# 1. Check the number of arguments
if [ $# -lt 1 ]
then
  echo Not Enough Arguments!
  exit;
fi
# 2. Traverse all machines in the cluster
for host in hadoop102 hadoop103
do
  echo ====================  $host  ====================
  # 3. Traverse all files/directories passed in and send each one
  for file in $@
  do
    # 4. Check whether the file exists
    if [ -e $file ]
    then
      # 5. Get the parent directory
      pdir=$(cd -P $(dirname $file); pwd)
      # 6. Get the name of the current file
      fname=$(basename $file)
      ssh $host "mkdir -p $pdir"
      rsync -av $pdir/$fname $host:$pdir
    else
      echo $file does not exist!
    fi
  done
done
Test: the script can now be used.
[root@xiaolei bin]# xsync /home/lei/bin/xsync
==================== hadoop101 ====================
root@hadoop101's password:
root@hadoop101's password:
sending incremental file list
xsync

sent 693 bytes  received 35 bytes  208.00 bytes/sec
total size is 602  speedup is 0.83
II. Distributed cluster construction
2.1 Cluster Planning
Three machines: hadoop101, hadoop102, hadoop103.
The user name is lei, and this user has root privileges.
Resource allocation:
NameNode, ResourceManager, and SecondaryNameNode each require more resources, so they are assigned to different machines.
DataNode is responsible for data storage; since all three machines provide storage, all three run a DataNode. NodeManager is responsible for managing the resources of its node; since all three machines provide compute, all three run a NodeManager.
hadoop101: NameNode DataNode NodeManager
hadoop102: ResourceManager DataNode NodeManager
hadoop103: SecondaryNameNode DataNode NodeManager
2.2 Cluster Construction
1. Delete the data and logs directories under the Hadoop installation directory on each node; these are records from previous runs.
2. Configure JAVA_HOME in etc/hadoop/hadoop-env.sh under the Hadoop installation directory.
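For example (the JDK path below is an assumption; use the directory where your JDK is actually installed):

# hadoop-env.sh — point Hadoop at the JDK (example path, adjust to your installation)
export JAVA_HOME=/opt/module/jdk1.8.0_212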
3. Configure core-site.xml: define a path variable
<!-- Specify the NameNode address -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop101:9820</value>
</property>
<!-- The official configuration uses hadoop.tmp.dir to specify the Hadoop data storage directory.
     hadoop.data.dir used here is a self-defined variable; its value is referenced in hdfs-site.xml
     to specify the directories where the NameNode and DataNode store data. -->
<property>
    <name>hadoop.data.dir</name>
    <value>/opt/module/hadoop-3.1.3/data</value>
</property>
4. Configure hdfs-site.xml
Manually specify the storage location
<!-- Number of replicas -->
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
<!-- NameNode data storage directory -->
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.data.dir}/name</value>
</property>
<!-- DataNode data storage directory -->
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.data.dir}/data</value>
</property>
<!-- SecondaryNameNode data storage directory -->
<property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file://${hadoop.data.dir}/namesecondary</value>
</property>
<!-- DataNode restart timeout -->
<property>
    <name>dfs.client.datanode-restart.timeout</name>
    <value>30s</value>
</property>
<!-- nn web UI address -->
<property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop101:9870</value>
</property>
<!-- 2nn web UI address -->
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop103:9868</value>
</property>
5. Configure yarn-site.xml
<!-- Specify the shuffle service for MapReduce -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- Specify the ResourceManager address -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop102</value>
</property>
<!-- Environment variables inherited by containers -->
<property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<!-- Disable the virtual memory check -->
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
6. Configure mapred-site.xml
<!-- Run MapReduce on YARN -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
7. Synchronize the configuration
xsync /opt/module/hadoop-3.1.3
2.3 Cluster Single-Point Startup
1. If the cluster is started for the first time, format the NameNode:
hdfs namenode -format
2. Start HDFS
Start NameNode on hadoop101:
hdfs --daemon start namenode
Start DataNode on hadoop101, hadoop102, and hadoop103:
hdfs --daemon start datanode
Start SecondaryNameNode on hadoop103:
hdfs --daemon start secondarynamenode
http://hadoop103:9868/status.html
http://hadoop101:9870/dfshealth.html
3. Start YARN
Start ResourceManager on hadoop102:
yarn --daemon start resourcemanager
Start NodeManager on hadoop101, hadoop102, and hadoop103:
yarn --daemon start nodemanager
http://hadoop101:8088/cluster
4. Cluster testing
A. Create a /user/lei/input directory in HDFS:
[root@xiaolei hadoop-3.1.3]# hdfs dfs -mkdir -p /user/lei/input
B. Upload local files to the created directory:
[root@xiaolei hadoop-3.1.3]# hdfs dfs -put wcinput/wc.input /user/lei/input
C. How can you see where the data is actually stored in HDFS? Hadoop keeps three replicas by default; with replication set to 1, the single replica's location is chosen by HDFS. You can also browse the files through the HDFS web UI (the file management page).
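One way to see where a file's blocks actually live is hdfs fsck, sketched here with the file uploaded in step B:

hdfs dfs -ls /user/lei/input                                   # list the uploaded file
hdfs fsck /user/lei/input/wc.input -files -blocks -locations   # show its blocks and the DataNodes holding them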
D. Test YARN
[lei@hadoop103 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /user/lei/input /user/lei/output
E. Problem: if HDFS reports a permission-denied error during the test, there are two solutions. The first is to modify the configuration file directly and add the following HDFS configuration:
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
This is not recommended for use in production environments.
The second option is to add directory permissions:
[lei@hadoop101 hadoop-3.1.3]$ bin/hdfs dfs -chmod -R 777 /
2.4 SSH Password-Free Login
1. Existing problems:
The cluster is currently started and stopped one daemon at a time on each node. What if the cluster has many machines?
2. Problem-solving idea
Write a script that starts and stops the whole cluster.
The general idea of the script:
Log in to hadoop101 to start and stop the NameNode.
Log in to hadoop103 to start and stop the SecondaryNameNode (2nn).
Log in to hadoop101, hadoop102, and hadoop103 to start and stop the DataNodes.
Log in to hadoop102 to start and stop the ResourceManager.
Log in to hadoop101, hadoop102, and hadoop103 to start and stop the NodeManagers.
3. How do I log in to a remote machine
ssh IP/hostname

[root@xiaolei hadoop-3.1.3]# ssh hadoop101
root@hadoop101's password:
Last login: Wed Aug 26 10:41:51 2020
[root@hadoop101 ~]# exit
Connection to hadoop101 closed.
4. Avoid having to enter a password when logging in to a remote machine over SSH
Use a public/private key pair for password-free login. Goal: hadoop101, hadoop102, and hadoop103 can all log in to each other without a password.
[lei@hadoop101 /]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/lei/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/lei/.ssh/id_rsa.
Your public key has been saved in /home/lei/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:73UcABIdwg3Vvosv8YTfiZbwnjf+p8kubOPcwZk/ST0 lei@hadoop101
The key's randomart image is:
+---[RSA 2048]----+
|       .=*+o     |
|       .oo..     |
|        ..       |
|        ..       |
|        S . .. . |
|       .+ oo E.  |
|       .X.=O.o   |
|      .oo&==*.   |
|      .B**B+=    |
+----[SHA256]-----+
[lei@hadoop101 /]$
Copy the public key to the other machines to authorize password-free login:
ssh-copy-id hadoop101
ssh-copy-id hadoop102
ssh-copy-id hadoop103
Do the same on the other machines so that they can all log in to each other without a password.
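A quick check (assuming the public keys have been copied as above): logging in to another node should no longer ask for a password.

[lei@hadoop101 ~]$ ssh hadoop102    # should log in without a password prompt
[lei@hadoop102 ~]$ exit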
2.5 Cluster Group Startup and Shutdown
1. Scripts to use
They are in the sbin directory:
start-dfs.sh / stop-dfs.sh     run on the node where the NameNode is located
start-yarn.sh / stop-yarn.sh   run on the node where the ResourceManager is located
2. How does Hadoop know on which machines to start the NameNode, SecondaryNameNode, DataNode, NodeManager, and ResourceManager?
The NameNode, SecondaryNameNode, and ResourceManager locations are already specified in the configuration files.
The remaining DataNodes and NodeManagers are determined by the workers file: Hadoop 3 uses the workers file to configure which nodes start DataNode and NodeManager.
vim /opt/module/hadoop-3.1.3/etc/hadoop/workers
Add the host names, then distribute the file to each node (see the command after the list below):
hadoop101
hadoop102
hadoop103
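Then distribute the workers file to the other nodes, for example with the xsync script from section 1.2:

xsync /opt/module/hadoop-3.1.3/etc/hadoop/workers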
3. Start the cluster
1) Start/stop HDFS: run the scripts on the node where the NameNode resides.
start-dfs.sh stop-dfs.sh
2) Start/stop YARN: run the scripts on the node where the ResourceManager resides.
start-yarn.sh stop-yarn.sh
4. Management scripts
Go to /home/lei/bin and create a script named mycluster.sh to start and stop the cluster:
#!/bin/bash
if [ $# -lt 1 ]
then
  echo Not Enough Arguments
  exit;
fi
case $1 in
"start")
  echo "========start hdfs============"
  ssh hadoop101 /opt/sofeware/hadoop-3.1.3/sbin/start-dfs.sh
  echo "========start yarn==========="
  ssh hadoop102 /opt/sofeware/hadoop-3.1.3/sbin/start-yarn.sh
;;
"stop")
  echo "========stop yarn============"
  ssh hadoop102 /opt/sofeware/hadoop-3.1.3/sbin/stop-yarn.sh
  echo "========stop hdfs==========="
  ssh hadoop101 /opt/sofeware/hadoop-3.1.3/sbin/stop-dfs.sh
;;
*)
  echo "Input args error"
;;
esac
Create a myjps.sh script to view the jps output of each node without logging in to each machine:
#!/bin/bash
for i in hadoop101 hadoop102 hadoop103
do
  echo "===== $i jps======"
  ssh $i /opt/sofeware/java8/bin/jps
done
After granting execute permission, list the directory; there are now three scripts:
[lei@hadoop101 bin]$ ll
total 12
-rwxrwxrwx. 1 lei lei 581 Mar 17 21:25 mycluster.sh
-rwxrwxrwx. 1 lei lei 126 Mar 17 21:27 myjps.sh
-rwxrwxrwx. 1 lei lei 613 Mar 15 21:23 xsync
Group startup test:
[lei@hadoop101 hadoop]$ mycluster.sh start
========start hdfs============
Starting namenodes on [hadoop101]
Starting datanodes
Starting secondary namenodes [hadoop103]
========start yarn===========
Starting resourcemanager
Starting nodemanagers
[lei@hadoop101 hadoop]$ myjps.sh
===== hadoop101 jps======
41730 NodeManager
42325 Jps
40696 NameNode
40911 DataNode
===== hadoop102 jps======
30074 Jps
29099 ResourceManager
29326 NodeManager
28383 DataNode
===== hadoop103 jps======
23360 NodeManager
22962 SecondaryNameNode
22547 DataNode
24025 Jps
2.6 Configuring a History Server
To view the historical running status of the program, you need to configure the history server. The specific configuration is as follows:
1) Configure mapred-site.xml
Add the following configuration to the file:
<!-- History server address -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop101:10020</value>
</property>
<!-- History server web address -->
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop101:19888</value>
</property>
2) Distribute the configuration
xsync $HADOOP_HOME/etc/hadoop/mapred-site.xml
3) Start and close the history server
mapred --daemon start historyserver
mapred --daemon stop historyserver
4) View JobHistory
http://hadoop101:19888/jobhistory
2.7 Configuring the Log Aggregation Function
Concept of log aggregation: after an application finishes running, its logs are aggregated and uploaded to HDFS.
Benefit of log aggregation: you can easily view the details of a program run, which is convenient for development and debugging.
Note: to enable log aggregation, you need to restart the NodeManager, ResourceManager, and HistoryServer.
1. Configure yarn-site.xml
<!-- Enable log aggregation -->
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- Log server URL -->
<property>
    <name>yarn.log.server.url</name>
    <value>http://hadoop101:19888/jobhistory/logs</value>
</property>
<!-- Keep aggregated logs for 7 days (604800 seconds) -->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>
2. Distribute configuration
xsync $HADOOP_HOME/etc/hadoop/yarn-site.xml
3. Restart the cluster
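One way to restart everything, reusing the scripts and commands from the sections above (a sketch; script names and paths as defined earlier):

mycluster.sh stop                          # stop YARN and HDFS
mapred --daemon stop historyserver         # stop the history server
mycluster.sh start                         # start HDFS and YARN again
mapred --daemon start historyserver        # start the history server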
You can see the history server in action:
2.8 Synchronizing cluster time
Time synchronization can be performed in two ways:
1. Synchronize the clock over the network (the virtual machines must be able to reach the Internet).
Install ntpdate on all three machines:
yum -y install ntpdate
Synchronize with the Alibaba Cloud time server:
sudo ntpdate ntp4.aliyun.com
Add a scheduled task on all three machines:
crontab -e
Add the following:
*/1 * * * * /usr/sbin/ntpdate ntp4.aliyun.com;
2. Use one machine as the time server and have all other machines synchronize their time with it. (Not covered here.)