Premise: White Bear has three laptops, all of them installed with CentOS 7.6. Here is the detailed installation process for the Hadoop cluster.

1. Install the ifconfig tool (net-tools)

  • All three machines need to execute the following command
    • yum install -y net-tools.x86_64

2. Change the IP addresses of the three servers to static IP addresses

  • Edit the network configuration file and add or change the following

  • MyNode01

    # Edit /etc/sysconfig/network-scripts/ifcfg-ens33
    BOOTPROTO="static"
    ONBOOT=yes
    BROADCAST=192.168.2.255
    IPADDR=192.168.2.114
    NETMASK=255.255.255.0
    GATEWAY=192.168.2.1
    # Add DNS
    DNS1=192.168.2.1
  • MyNode02

    # Edit /etc/sysconfig/network-scripts/ifcfg-ens33
    BOOTPROTO="static"
    ONBOOT=yes
    BROADCAST=192.168.2.255
    IPADDR=192.168.2.115
    NETMASK=255.255.255.0
    GATEWAY=192.168.2.1
    # Add DNS
    DNS1=192.168.2.1
  • MyNode03

    # Edit /etc/sysconfig/network-scripts/ifcfg-ens33
    BOOTPROTO="static"
    ONBOOT=yes
    BROADCAST=192.168.2.255
    IPADDR=192.168.2.116
    NETMASK=255.255.255.0
    GATEWAY=192.168.2.1
    # Add DNS
    DNS1=192.168.2.1
  • Restarting the Network Service

    • service network restart
  • The IP addresses of the three servers are:

    • 192.168.2.114
    • 192.168.2.115
    • 192.168.2.116
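  • Optional check (a hedged sketch, not in the original steps): confirm each machine picked up its static address after the restart
    # assumes the interface is ens33, as in the config files above
    ifconfig ens33
    # or, without net-tools:
    ip addr show ens33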

3. Close the firewall

  • Execute on all three machines
    • systemctl disable firewalld.service
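  • Optional extra step (a hedged sketch, not in the original): disabling only prevents the firewall from starting on boot, so also stop the running service and check its state
    systemctl stop firewalld.service
    systemctl status firewalld.service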

4. Disable SELinux

  • Execute on all three machines
    • vim /etc/selinux/config
    • SELINUX=disabled
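  • Optional extra step (a hedged sketch, not in the original): the config file change only takes effect after a reboot, so switch to permissive mode for the current session and verify
    setenforce 0
    getenforce   # prints Enforcing, Permissive, or Disabled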

5. Change the host name

  • 192.168.2.114
    • vim /etc/hostname
    • MyNode01
  • 192.168.2.115
    • vim /etc/hostname
    • MyNode02
  • 192.168.2.116
    • vim /etc/hostname
    • MyNode03
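  • Alternative (a hedged sketch, equivalent on CentOS 7): hostnamectl changes the host name without editing the file by hand, for example on 192.168.2.114
    hostnamectl set-hostname MyNode01
    hostname   # confirm the new name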

6. Add the mapping between host names and IP addresses

  • Execute on all three machines
    • vim /etc/hosts
      192.168.2.114  MyNode01
      192.168.2.115  MyNode02
      192.168.2.116  MyNode03
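  • Optional check (a hedged sketch, not in the original steps): confirm the mappings resolve from each machine
    ping -c 1 MyNode01
    ping -c 1 MyNode02
    ping -c 1 MyNode03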

7. Add common users

  • Execute on all three machines
    • The password is 123456 for both root and the regular user (for testing only)
    • Add the common user icebear
      • useradd icebear
      • passwd icebear (enter 123456 when prompted)
    • Grant the common user sudo permission by adding the following line at the end of the sudoers file
      • visudo
      • icebear ALL=(ALL) ALL
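  • Optional check (a hedged sketch, not in the original steps): confirm the new user and its sudo permission work
    su - icebear
    sudo whoami   # should print "root" after entering icebear's password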

8. Set up passwordless SSH login

  • It is best to set this up for both the root and icebear users
  • Execute on all three machines
    # switch to icebear and go back to the home directory
    su icebear
    cd
    # generate a key pair
    ssh-keygen -t rsa
    # copy the public key to authorized_keys on MyNode01
    cd .ssh
    ssh-copy-id MyNode01
  • Execute on the MyNode01 machine
    • Copy the authorized_keys file to another machine
      cd /home/icebear/.ssh
      scp authorized_keys  MyNode02:$PWD
      scp authorized_keys  MyNode03:$PWD
  • Restart
    • All three machines need to be executed
    • reboot
  • Passwordless login test
    • Connect to each of the three machines once, because the first connection requires confirmation
      ssh MyNode01
      ssh MyNode02
      ssh MyNode03
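  • If a password prompt still appears (a hedged troubleshooting note, not in the original steps): sshd ignores keys whose files are too open, so tighten the permissions on the icebear user's .ssh directory
    chmod 700 /home/icebear/.ssh
    chmod 600 /home/icebear/.ssh/authorized_keys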

9. Create an installation directory

  • Execute on all three machines (root user)
    # everything lives under /home/bgd
    mkdir -p /home/bgd/soft      # uploaded installation packages
    mkdir -p /home/bgd/install   # unpacked/installed software

10. Install the JDK

  • On the MyNode01 machine (icebear user)
    • Upload the JDK package to the soft folder on the MyNode01 machine
    • Unpack the archive
      cd /home/bgd/soft
      # decompress the package to the installation directory
      tar -zxf jdk-8u141-linux-x64.tar.gz -C /home/bgd/install/
    • Modifying environment Variables
      sudo vim /etc/profile
      # configure the JDK environment variables
      export JAVA_HOME=/home/bgd/install/jdk1.8.0_141
      export PATH=$JAVA_HOME/bin:$PATH
    • Enable environment variables
      • source /etc/profile
    • Viewing the Java Version
      • java -version
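    • Optional check (a hedged sketch, not in the original steps): confirm the variables point at the unpacked JDK
      echo $JAVA_HOME   # should print /home/bgd/install/jdk1.8.0_141
      which java        # should resolve to a path under $JAVA_HOME/bin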

11. Install and configure Hadoop

  • On the MyNode01 machine (icebear user)
    1. Upload the Hadoop installation package and decompress it
      • tar -xzvf hadoop-2.6.0-cdh5.14.2.tar.gz -C /home/bgd/install
    2. Configure hadoop environment variables
      sudo vim /etc/profile
      # add the following content
      JAVA_HOME=/home/bgd/install/jdk1.8.0_141
      HADOOP_HOME=/home/bgd/install/hadoop-2.6.0-cdh5.14.2
      PATH=$PATH:$HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
      export JAVA_HOME
      export HADOOP_HOME
      export PATH
    3. Enable environment variables
      • source /etc/profile
    4. View the Hadoop version information
      • hadoop version
    5. Configure hadoop-env.sh
      • sudo vim /home/bgd/install/hadoop-2.6.0-cdh5.14.2/etc/hadoop/hadoop-env.sh
      • export JAVA_HOME=/home/bgd/install/jdk1.8.0_141
    6. Configure core-site.xml
      • sudo vim /home/bgd/install/hadoop-2.6.0-cdh5.14.2/etc/hadoop/core-site.xml
        <configuration>
            <property>
                <name>fs.defaultFS</name>
                <value>hdfs://MyNode01:8020</value>
            </property>
            <property>
                <name>hadoop.tmp.dir</name>
                <value>/home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/tempDatas</value>
            </property>
            <!-- Buffer size; in real work, adjust it according to server performance -->
            <property>
                <name>io.file.buffer.size</name>
                <value>4096</value>
            </property>
            <property>
                <name>fs.trash.interval</name>
                <value>10080</value>
                <description>Number of minutes after which a checkpoint is deleted. If zero, the trash feature is disabled. This option may be configured on both the server and the client. If trash is disabled on the server side, the client configuration is checked. If trash is enabled on the server side, the value configured on the server is used and the client configuration is ignored.</description>
            </property>
            <property>
                <name>fs.trash.checkpoint.interval</name>
                <value>0</value>
                <description>Number of minutes between trash checkpoints. Should be smaller than or equal to fs.trash.interval. If zero, the value is set to the value of fs.trash.interval. Every time the checkpointer runs, it creates a new checkpoint out of the current one and removes checkpoints created more than fs.trash.interval minutes ago.</description>
            </property>
        </configuration>
    7. Configure hdfs-site.xml
      • sudo vim /home/bgd/install/hadoop-2.6.0-cdh5.14.2/etc/hadoop/hdfs-site.xml
        <configuration>
            <!-- NameNode metadata storage path. In real work, determine the disk mount directory first, then use multiple directories. -->
            <!-- Cluster dynamic decommissioning
            <property>
                <name>dfs.hosts</name>
                <value>/home/bgd/install/hadoop-2.6.0-cdh5.14.2/etc/hadoop/accept_host</value>
            </property>
            <property>
                <name>dfs.hosts.exclude</name>
                <value>/home/bgd/install/hadoop-2.6.0-cdh5.14.2/etc/hadoop/deny_host</value>
            </property>
            -->
            <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>MyNode02:50090</value>
            </property>
            <property>
                <name>dfs.namenode.http-address</name>
                <value>MyNode01:50070</value>
            </property>
            <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:///home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/namenodeDatas</value>
            </property>
            <!-- DataNode data storage path. In real work, determine the disk mount directory first, then use multiple directories. -->
            <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:///home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/datanodeDatas</value>
            </property>
            <property>
                <name>dfs.namenode.edits.dir</name>
                <value>file:///home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/dfs/nn/edits</value>
            </property>
            <property>
                <name>dfs.namenode.checkpoint.dir</name>
                <value>file:///home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/dfs/snn/name</value>
            </property>
            <property>
                <name>dfs.namenode.checkpoint.edits.dir</name>
                <value>file:///home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/dfs/nn/snn/edits</value>
            </property>
            <property>
                <name>dfs.replication</name>
                <value>3</value>
            </property>
            <property>
                <name>dfs.permissions</name>
                <value>false</value>
            </property>
            <property>
                <name>dfs.blocksize</name>
                <value>134217728</value>
            </property>
        </configuration>
    8. Configure mapred-site.xml
      • cd /home/bgd/install/hadoop-2.6.0-cdh5.14.2/etc/hadoop
      • cp mapred-site.xml.template mapred-site.xml
      • sudo vim /home/bgd/install/hadoop-2.6.0-cdh5.14.2/etc/hadoop/mapred-site.xml
        <!-- Specify the framework on which MapReduce runs -->
        <configuration>
            <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
            </property>
            <property>
                <name>mapreduce.job.ubertask.enable</name>
                <value>true</value>
            </property>
            <property>
                <name>mapreduce.jobhistory.address</name>
                <value>MyNode01:10020</value>
            </property>
            <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>MyNode01:19888</value>
            </property>
        </configuration>
    9. Configure yarn-site.xml
      • sudo vim /home/bgd/install/hadoop-2.6.0-cdh5.14.2/etc/hadoop/yarn-site.xml
        <configuration>
            <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>MyNode01</value>
            </property>
            <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
            </property>
            <property>
                <name>yarn.log-aggregation-enable</name>
                <value>true</value>
            </property>
            <property>
                <name>yarn.log.server.url</name>
                <value>http://MyNode01:19888/jobhistory/logs</value>
            </property>
            <!-- How many seconds to keep aggregated logs -->
            <property>
                <name>yarn.log-aggregation.retain-seconds</name>
                <value>2592000</value><!-- 30 days -->
            </property>
            <!-- How many seconds to keep user logs; only applies when log aggregation is disabled -->
            <property>
                <name>yarn.nodemanager.log.retain-seconds</name>
                <value>604800</value><!-- 7 days -->
            </property>
            <!-- Compression type used for aggregated logs -->
            <property>
                <name>yarn.nodemanager.log-aggregation.compression-type</name>
                <value>gz</value>
            </property>
            <!-- NodeManager local file storage directory -->
            <property>
                <name>yarn.nodemanager.local-dirs</name>
                <value>/home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/yarn/local</value>
            </property>
            <!-- Maximum number of completed applications kept by the ResourceManager -->
            <property>
                <name>yarn.resourcemanager.max-completed-applications</name>
                <value>1000</value>
            </property>
        </configuration>
    10. Edit the slaves file
      • sudo vim /home/bgd/install/hadoop-2.6.0-cdh5.14.2/etc/hadoop/slaves
        MyNode01
        MyNode02
        MyNode03
    11. Create a directory for storing files
      mkdir -p /home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/tempDatas
      mkdir -p /home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/namenodeDatas
      mkdir -p /home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/datanodeDatas
      mkdir -p /home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/dfs/nn/edits
      mkdir -p /home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/dfs/snn/name
      mkdir -p /home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/dfs/nn/snn/edits

12. Copy the configured Hadoop to the other machines

  • Execute on the MyNode01 machine (icebear user)
    • Delete the doc documentation directory
      • cd /home/bgd/install/hadoop-2.6.0-cdh5.14.2/share/
      • rm -rf doc/
    • Copy Hadoop to the other machines
      • cd /home/bgd/install/
      • sudo scp -r hadoop-2.6.0-cdh5.14.2 MyNode02:$PWD
      • sudo scp -r hadoop-2.6.0-cdh5.14.2 MyNode03:$PWD
    • Copy global variables to other machines
      • sudo scp /etc/profile MyNode02:/etc/
      • sudo scp /etc/profile MyNode03:/etc/
  • Execute on all three machines (icebear user)
    • Enable environment variables
      • source /etc/profile
    • Viewing the Hadoop Version
      • hadoop version

13. Add directory permissions for the common user

  • Execute on all three machines (root user)
    • chown -R icebear:icebear /home/bgd
    • chmod -R 755 /home/bgd
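  • Optional check (a hedged sketch, not in the original steps): confirm the ownership change was applied
    ls -ld /home/bgd          # owner and group should now be icebear
    ls -l /home/bgd/install   # subdirectories should also be owned by icebear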

14. Format Hadoop

  • On the MyNode01 machine (icebear user)
    • su icebear
    • hdfs namenode -format
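    • Optional check (a hedged sketch, not in the original steps): a successful format creates the NameNode metadata directory configured in hdfs-site.xml
      ls /home/bgd/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/namenodeDatas/current/
      # should list a VERSION file and an fsimage file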

15. Start/Stop the Hadoop cluster

  • On the MyNode01 machine (icebear user)
    • Start
      • start-all.sh
    • Stop
      • stop-all.sh
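  • Optional check (a hedged sketch, not in the original steps): after start-all.sh, run jps on each machine to list the running Java daemons; with the configuration above, MyNode01 should roughly show NameNode, DataNode, ResourceManager and NodeManager, MyNode02 adds SecondaryNameNode, and MyNode03 shows DataNode and NodeManager
    jps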

16. View the Web interface

  • Configure the hosts file on the user's host machine (using macOS as an example)
    • vim /etc/hosts
      192.168.2.114 MyNode01
      192.168.2.115 MyNode02
      192.168.2.116 MyNode03
  • Type in the browser
    • http://mynode01:50070/dfshealth.html#tab-overview
  • If the hosts file is not configured on the user host
    • http://192.168.2.114:50070/dfshealth.html#tab-overview
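  • Optional check from the command line (a hedged sketch, not in the original steps): the same cluster overview is available without a browser
    # run on any cluster node as icebear
    hdfs dfsadmin -report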

End of this chapter.