I. Environment configuration
Three VMs are available.
Because a cluster is made up of multiple servers, any configuration change must be synchronized across the whole cluster, which means making the same change on every machine. When a cluster has many servers, this repetitive work is tedious and error-prone, so a cluster distribution script is needed.
1.1 scp (Secure Copy)
scp can copy data from server to server. The basic syntax is as follows:
scp -r source-path/name user@host:destination-path/name
(command, -r for recursive, path/name of the file to copy, destination user@host:destination path/name)

[root@xiaolei module]# scp -r /etc/profile.d/my_env.sh root@hadoop104:/etc/profile.d/
1.2 xsync Cluster Distribution Script
The difference between rsync and scp: rsync is faster than scp because it only transfers files that differ, while scp copies every file. Basic syntax:
rsync -av $pdir/$fname $user@hadoop$host:$pdir/$fname
(command, options, path/name of the file to copy, destination user@host:destination path/name)

Options:
-a: archive copy
-v: display the copy process
Name the distribution script xsync.
Note: add /home/lei/bin to PATH in /etc/profile as shown below, so that a script placed there can be called from anywhere.
export PATH=$PATH:/home/lei/bin
Then create the bin directory under /home/lei and create the xsync file in it:
[lei@xiaolei module]$ cd /home/lei
[lei@xiaolei lei]$ mkdir bin
[lei@xiaolei lei]$ cd bin
[lei@xiaolei bin]$ touch xsync
[lei@xiaolei bin]$ vim xsync
Script content:
#!/bin/bash
# 1. Check the number of arguments
if [ $# -lt 1 ]
then
  echo Not Enough Arguments!
  exit;
fi
# 2. Traverse all machines in the cluster
for host in hadoop102 hadoop103
do
  echo ====================  $host  ====================
  # 3. Traverse all files/directories passed in and send each one
  for file in $@
  do
    # 4. Check whether the file exists
    if [ -e $file ]
    then
      # 5. Get the parent directory
      pdir=$(cd -P $(dirname $file); pwd)
      # 6. Get the name of the current file
      fname=$(basename $file)
      ssh $host "mkdir -p $pdir"
      rsync -av $pdir/$fname $host:$pdir
    else
      echo $file does not exist!
    fi
  done
done
Test: the script can now be used.
[root@xiaolei bin]# xsync /home/lei/bin/xsync
==================== hadoop101 ====================
root@hadoop101's password:
root@hadoop101's password:
sending incremental file list
xsync

sent 693 bytes  received 35 bytes  208.00 bytes/sec
total size is 602  speedup is 0.83
II. Distributed cluster construction
2.1 Cluster Planning
Three machines: hadoop101, hadoop102, hadoop103.
The user name is lei, and this user has root privileges.
Resource allocation:
NameNode, ResourceManager, and SecondaryNameNode each require more resources, so they are assigned to different machines.
DataNode is responsible for data storage; since all three machines provide storage, all three run a DataNode. NodeManager is responsible for managing the resources of its node; since all three machines provide compute, all three run a NodeManager.
hadoop101: NameNode DataNode NodeManager
hadoop102: ResourceManager DataNode NodeManager
hadoop103: SecondaryNameNode DataNode NodeManager
2.2 Cluster Construction
1. Delete the data and logs directories under the Hadoop installation directory on each node; these are records from previous runs.
2. Configure JAVA_HOME in etc/hadoop/hadoop-env.sh under the Hadoop installation directory.
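For example (the JDK path below is an assumption; use the directory where your JDK is actually installed):

# hadoop-env.sh — point Hadoop at the JDK (example path, adjust to your installation)
export JAVA_HOME=/opt/module/jdk1.8.0_212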
3. Configure core-site.xml: define a path variable
<!-- Specify the NameNode address -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop101:9820</value>
</property>
<!-- The official configuration uses hadoop.tmp.dir to specify the Hadoop data storage directory.
     hadoop.data.dir used here is a self-defined variable; its value is referenced in hdfs-site.xml
     to specify the directories where the NameNode and DataNode store data. -->
<property>
    <name>hadoop.data.dir</name>
    <value>/opt/module/hadoop-3.1.3/data</value>
</property>
4. Configure hdfs-site.xml
Manually specify the storage location
<!-- Number of replicas -->
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
<!-- NameNode data storage directory -->
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.data.dir}/name</value>
</property>
<!-- DataNode data storage directory -->
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.data.dir}/data</value>
</property>
<!-- SecondaryNameNode data storage directory -->
<property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file://${hadoop.data.dir}/namesecondary</value>
</property>
<!-- DataNode restart timeout -->
<property>
    <name>dfs.client.datanode-restart.timeout</name>
    <value>30s</value>
</property>
<!-- nn web UI address -->
<property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop101:9870</value>
</property>
<!-- 2nn web UI address -->
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop103:9868</value>
</property>
5. Configure yarn-site.xml
<!-- Specify the shuffle service for MapReduce -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- Specify the ResourceManager address -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop102</value>
</property>
<!-- Environment variables inherited by containers -->
<property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<!-- Disable the virtual memory check -->
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
6. Configure mapred-site.xml
<!-- Run MapReduce on YARN -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
7. Synchronize the configuration
xsync /opt/module/hadoop-3.1.3
2.3 Cluster Single-Point Startup
1. If the cluster is started for the first time, format the NameNode:
hdfs namenode -format
2. Start HDFS
Start NameNode on hadoop101:
hdfs --daemon start namenode
Start DataNode on hadoop101, hadoop102, and hadoop103:
hdfs --daemon start datanode
Start SecondaryNameNode on hadoop103:
hdfs --daemon start secondarynamenode
http://hadoop103:9868/status.html
http://hadoop101:9870/dfshealth.html
3. Start YARN
Start ResourceManager on hadoop102:
yarn --daemon start resourcemanager
Start NodeManager on hadoop101, hadoop102, and hadoop103:
yarn --daemon start nodemanager
http://hadoop101:8088/cluster
4. Cluster testing
A. Create a /user/lei/input directory in HDFS:
[root@xiaolei hadoop-3.1.3]# hdfs dfs -mkdir -p /user/lei/input
B. Upload local files to the created directory:
[root@xiaolei hadoop-3.1.3]# hdfs dfs -put wcinput/wc.input /user/lei/input
C. How can you see where the data is actually stored in HDFS? Hadoop keeps three replicas by default; with replication set to 1, the single replica's location is chosen by HDFS. You can also browse the files through the HDFS web UI (the file management page).
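One way to see where a file's blocks actually live is hdfs fsck, sketched here with the file uploaded in step B:

hdfs dfs -ls /user/lei/input                                   # list the uploaded file
hdfs fsck /user/lei/input/wc.input -files -blocks -locations   # show its blocks and the DataNodes holding them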
D. Test YARN
[lei@hadoop103 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /user/lei/input /user/lei/output
E. Problem: if HDFS reports a permission-denied error during the test, there are two solutions. The first is to modify the configuration file directly and add the following HDFS configuration:
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
This is not recommended for use in production environments.
The second option is to add directory permissions:
[lei@hadoop101 hadoop-3.1.3]$ bin/hdfs dfs -chmod -R 777 /
2.4 SSH Password-Free Login
1. Existing problems:
The cluster is currently started and stopped one daemon at a time on each node. What if the cluster has many machines?
2. Problem-solving idea
Write a script that starts and stops the whole cluster.
The general idea of the script:
Log in to hadoop101 to start and stop the NameNode.
Log in to hadoop103 to start and stop the SecondaryNameNode (2nn).
Log in to hadoop101, hadoop102, and hadoop103 to start and stop the DataNodes.
Log in to hadoop102 to start and stop the ResourceManager.
Log in to hadoop101, hadoop102, and hadoop103 to start and stop the NodeManagers.
3. How do I log in to a remote machine
ssh IP/hostname

[root@xiaolei hadoop-3.1.3]# ssh hadoop101
root@hadoop101's password:
Last login: Wed Aug 26 10:41:51 2020
[root@hadoop101 ~]# exit
Connection to hadoop101 closed.
4. Avoid having to enter a password when logging in to a remote machine over SSH
Use a public/private key pair for password-free login. Goal: hadoop101, hadoop102, and hadoop103 can all log in to each other without a password.
[lei@hadoop101 /]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/lei/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/lei/.ssh/id_rsa.
Your public key has been saved in /home/lei/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:73UcABIdwg3Vvosv8YTfiZbwnjf+p8kubOPcwZk/ST0 lei@hadoop101
The key's randomart image is:
+---[RSA 2048]----+
|       .=*+o     |
|       .oo..     |
|        ..       |
|        ..       |
|        S . .. . |
|       .+ oo E.  |
|       .X.=O.o   |
|      .oo&==*.   |
|      .B**B+=    |
+----[SHA256]-----+
[lei@hadoop101 /]$
Copy the public key to the other machines to authorize password-free login:
ssh-copy-id hadoop101
ssh-copy-id hadoop102
ssh-copy-id hadoop103
Do the same on the other machines so that they can all log in to each other without a password.
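A quick check (assuming the public keys have been copied as above): logging in to another node should no longer ask for a password.

[lei@hadoop101 ~]$ ssh hadoop102    # should log in without a password prompt
[lei@hadoop102 ~]$ exit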
2.5 Cluster Group Startup and Shutdown
1. Scripts to use
They are in the sbin directory:
start-dfs.sh / stop-dfs.sh     run on the node where the NameNode is located
start-yarn.sh / stop-yarn.sh   run on the node where the ResourceManager is located
2. How does Hadoop know on which machines to start the NameNode, SecondaryNameNode, DataNode, NodeManager, and ResourceManager?
The NameNode, SecondaryNameNode, and ResourceManager locations are already specified in the configuration files.
The remaining DataNodes and NodeManagers are determined by the workers file: Hadoop 3 uses the workers file to configure which nodes start DataNode and NodeManager.
vim /opt/module/hadoop-3.1.3/etc/hadoop/workers
Add the host names, then distribute the file to each node (see the command after the list below):
hadoop101
hadoop102
hadoop103
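Then distribute the workers file to the other nodes, for example with the xsync script from section 1.2:

xsync /opt/module/hadoop-3.1.3/etc/hadoop/workers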
3. Start the cluster
1) Start/stop HDFS: run the scripts on the node where the NameNode resides.
start-dfs.sh stop-dfs.sh
2) Start/stop YARN: run the scripts on the node where the ResourceManager resides.
start-yarn.sh stop-yarn.sh
4. Management scripts
Go to /home/lei/bin and create a script named mycluster.sh to start and stop the cluster:
#!/bin/bash
if [ $# -lt 1 ]
then
  echo Not Enough Arguments
  exit;
fi
case $1 in
"start")
  echo "========start hdfs============"
  ssh hadoop101 /opt/sofeware/hadoop-3.1.3/sbin/start-dfs.sh
  echo "========start yarn==========="
  ssh hadoop102 /opt/sofeware/hadoop-3.1.3/sbin/start-yarn.sh
;;
"stop")
  echo "========stop yarn============"
  ssh hadoop102 /opt/sofeware/hadoop-3.1.3/sbin/stop-yarn.sh
  echo "========stop hdfs==========="
  ssh hadoop101 /opt/sofeware/hadoop-3.1.3/sbin/stop-dfs.sh
;;
*)
  echo "Input args error"
;;
esac
Create a myjps.sh script to view the jps output of each node without logging in to each machine:
#!/bin/bash
for i in hadoop101 hadoop102 hadoop103
do
  echo "===== $i jps======"
  ssh $i /opt/sofeware/java8/bin/jps
done
After granting execute permission, list the directory; there are now three scripts:
[lei@hadoop101 bin]$ ll
total 12
-rwxrwxrwx. 1 lei lei 581 Mar 17 21:25 mycluster.sh
-rwxrwxrwx. 1 lei lei 126 Mar 17 21:27 myjps.sh
-rwxrwxrwx. 1 lei lei 613 Mar 15 21:23 xsync
Group startup test:
[lei@hadoop101 hadoop]$ mycluster.sh start
========start hdfs============
Starting namenodes on [hadoop101]
Starting datanodes
Starting secondary namenodes [hadoop103]
========start yarn===========
Starting resourcemanager
Starting nodemanagers
[lei@hadoop101 hadoop]$ myjps.sh
===== hadoop101 jps======
41730 NodeManager
42325 Jps
40696 NameNode
40911 DataNode
===== hadoop102 jps======
30074 Jps
29099 ResourceManager
29326 NodeManager
28383 DataNode
===== hadoop103 jps======
23360 NodeManager
22962 SecondaryNameNode
22547 DataNode
24025 Jps
2.6 Configuring a History Server
To view the historical running status of the program, you need to configure the history server. The specific configuration is as follows:
1) Configure mapred-site.xml
Add the following configuration to the file:
<!-- History server address -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop101:10020</value>
</property>
<!-- History server web address -->
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop101:19888</value>
</property>
2) Distribute the configuration
xsync $HADOOP_HOME/etc/hadoop/mapred-site.xml
3) Start and close the history server
mapred --daemon start historyserver
mapred --daemon stop historyserver
4) View JobHistory
http://hadoop101:19888/jobhistory
2.7 Configuring the Log Aggregation Function
Concept of log aggregation: after an application finishes running, its logs are aggregated and uploaded to HDFS.
Benefit of log aggregation: you can easily view the details of a program run, which is convenient for development and debugging.
Note: to enable log aggregation, you need to restart the NodeManager, ResourceManager, and HistoryServer.
1. Configure yarn-site.xml
<!-- Enable log aggregation -->
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- Log server URL -->
<property>
    <name>yarn.log.server.url</name>
    <value>http://hadoop101:19888/jobhistory/logs</value>
</property>
<!-- Keep aggregated logs for 7 days (604800 seconds) -->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>
2. Distribute configuration
xsync $HADOOP_HOME/etc/hadoop/yarn-site.xml
3. Restart the cluster
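One way to restart everything, reusing the scripts and commands from the sections above (a sketch; script names and paths as defined earlier):

mycluster.sh stop                          # stop YARN and HDFS
mapred --daemon stop historyserver         # stop the history server
mycluster.sh start                         # start HDFS and YARN again
mapred --daemon start historyserver        # start the history server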
You can see the history server in action:
2.8 Synchronizing cluster time
Time synchronization can be performed in two ways:
1. Synchronize the clock over the network (the virtual machines must be able to reach the Internet).
Install ntpdate on all three machines:
yum -y install ntpdate
Synchronize with the Alibaba Cloud time server:
sudo ntpdate ntp4.aliyun.com
Add a scheduled task on all three machines:
crontab -e
Add the following:
*/1 * * * * /usr/sbin/ntpdate ntp4.aliyun.com;
2. Use one machine as the time server and have all other machines synchronize their time with it. (Not covered here.)