There is nothing noble in being superior to others. The true nobility is in being superior to your previous self
Hadoop fully distributed environment setup
Write scripts to distribute files
The application scenarios are as follows: For example, there are three hosts: Master1, Slave1, and Slave2
When building a fully distributed cluster, files need to be copied from master1 to each slave
You can use rsync to distribute individual files, or you can use the following script to distribute folders or files
```shell
#!/bin/bash
#1 get the number of arguments; exit if there are none
pcount=$#
if ((pcount==0)); then
  echo no args
  exit
fi

#2 get the file name
p1=$1
fname=$(basename "$p1")
echo fname=$fname

#3 get the absolute path of the parent directory
pdir=$(cd -P "$(dirname "$p1")"; pwd)
echo pdir=$pdir

#4 get the current user name
user=$(whoami)

#5 rsync can send files to a specified host; loop over the slaves
for ((host=1; host<3; host++))
do
  echo ------------------- slave$host --------------
  rsync -rvl "$pdir/$fname" "$user@slave$host:$pdir"
done
```
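The script leans on two shell tricks: `basename` strips the directory part of the argument, and `cd -P ...; pwd` canonicalises the parent directory into an absolute path so rsync pushes the file to the same location on the remote host. A standalone sketch (the `/tmp/xsync-demo` path is a made-up fixture for illustration):

```shell
#!/bin/bash
# Demonstrate the path resolution used by the xsync script:
# basename extracts the file name, and `cd -P ...; pwd` resolves
# a dot-ridden or relative parent directory into an absolute path.
mkdir -p /tmp/xsync-demo
touch /tmp/xsync-demo/hello.txt

p1=/tmp/xsync-demo/../xsync-demo/hello.txt
fname=$(basename "$p1")
pdir=$(cd -P "$(dirname "$p1")"; pwd)

echo "fname=$fname"   # fname=hello.txt
echo "pdir=$pdir"     # absolute path with the ../ segment resolved
```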
Cluster planning
| | hadoop102 | hadoop103 | hadoop104 |
|---|---|---|---|
| HDFS | NameNode, DataNode | DataNode | SecondaryNameNode, DataNode |
| YARN | NodeManager | ResourceManager, NodeManager | NodeManager |
Configure the cluster
The configuration files are all under hadoop-2.7.2/etc/hadoop
Configure core-site.xml
```xml
<!-- Specify the address of the NameNode in HDFS -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop102:9000</value>
</property>
<!-- Specify the storage directory for files Hadoop produces at runtime -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>
```
Configure hadoop-env.sh
```shell
export JAVA_HOME=/opt/module/jdk1.8.0_144
```
Configure hdfs-site.xml
```xml
<!-- Set the number of HDFS replicas -->
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<!-- Specify the SecondaryNameNode host -->
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop104:50090</value>
</property>
```
Configure yarn-env.sh
```shell
export JAVA_HOME=/opt/module/jdk1.8.0_144
```
Configure yarn-site.xml
```xml
<!-- Enable the shuffle auxiliary service on the NodeManager -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- Specify the ResourceManager address for YARN -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop103</value>
</property>
```
Configure mapred-env.sh
```shell
export JAVA_HOME=/opt/module/jdk1.8.0_144
```
Configure mapred-site.xml
mapred-site.xml does not exist by default; first copy it from the shipped template with `cp mapred-site.xml.template mapred-site.xml`, then add:
```xml
<!-- Specify that MapReduce runs on YARN -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
```
Distribute the configuration to the cluster
```shell
# xsync is the distribution script written above
xsync /opt/module/hadoop-2.7.2/
```
Cluster formatting
If the cluster is being started for the first time, format the NameNode on hadoop102:

```shell
hadoop namenode -format
```
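Re-formatting an existing cluster is a common pitfall: if the old `data` and `logs` directories are left in place, the freshly formatted NameNode gets a new clusterID that no longer matches what the DataNodes recorded, and they fail to register. A hypothetical dry-run helper (it only prints the cleanup commands; remove the `echo` wrapping to actually run them over ssh):

```shell
#!/bin/bash
# Print the per-node cleanup commands to run before re-formatting the
# NameNode, so the DataNodes do not keep a stale clusterID.
HADOOP_HOME=/opt/module/hadoop-2.7.2
for host in hadoop102 hadoop103 hadoop104; do
  echo "ssh $host rm -rf $HADOOP_HOME/data $HADOOP_HOME/logs"
done
```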
Cluster startup test
```shell
# on hadoop102: start HDFS
start-dfs.sh
# on hadoop103 (where the ResourceManager is planned): start YARN
start-yarn.sh
# on each node: use jps to view the running processes
jps
```
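Going by the cluster planning table above, `jps` on each node should show roughly the daemons below (plus the `Jps` process itself; PIDs will differ). A quick reference sketch:

```shell
#!/bin/bash
# Expected daemons per node, derived from the cluster planning table.
cat <<'EOF'
hadoop102: NameNode DataNode NodeManager
hadoop103: DataNode ResourceManager NodeManager
hadoop104: SecondaryNameNode DataNode NodeManager
EOF
```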
Related resources
GitHub: github.com/zhutiansama…
The official account for this article is FocusBigData
Reply [Big Data Interview], [Big Data Interview Experience], or [Big Data Learning Roadmap] for a surprise