Cluster planning

Port changes between Hadoop 2.x and 3.x

| Port | Hadoop 2.x | Hadoop 3.x |
| --- | --- | --- |
| NameNode internal communication port | 8020/9000 | 8020/9000/9820 |
| NameNode HTTP UI | 50070 | 9870 |
| MapReduce task view (YARN web UI) | 8088 | 8088 |
| History server communication port | 19888 | 19888 |

Container planning

| | hadoop02 | hadoop03 | hadoop04 |
| --- | --- | --- | --- |
| HDFS | NameNode, DataNode | DataNode | SecondaryNameNode, DataNode |
| YARN | NodeManager | ResourceManager, NodeManager | NodeManager |
| Mapped ports | 19888, 9870 | 8088 | 9868 |
| Startup commands | hdfs namenode -format, then start-dfs.sh | start-yarn.sh | |

Commands to execute

Create a network

docker network create -d bridge --subnet=172.18.0.0/24 --gateway=172.18.0.1 net
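
To double-check that the subnet and gateway took effect, the network can be inspected (an optional verification step, not part of the original walkthrough):

    docker network inspect net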

hadoop01

docker run -itd --privileged=true --name hadoop01 --net net --ip 172.18.0.11 guozhenhua/hadoop:4 /usr/sbin/init

hadoop02

docker run -itd --privileged=true --name hadoop02 --net net --ip 172.18.0.12 -p 9870:9870 -p 19888:19888 guozhenhua/hadoop:3 /usr/sbin/init

hadoop03

docker run -itd --privileged=true --name hadoop03 --net net --ip 172.18.0.13 -p 8088:8088 guozhenhua/hadoop:3 /usr/sbin/init

hadoop04

docker run -itd --privileged=true --name hadoop04 --net net --ip 172.18.0.14 -p 9868:9868 guozhenhua/hadoop:3 /usr/sbin/init

On hadoop02

  • Switch to the hadoop user: su hadoop
  • Format the NameNode: hdfs namenode -format
  • Start HDFS: start-dfs.sh

On hadoop03

  • Switch to the hadoop user: su hadoop
  • Start YARN: start-yarn.sh (a quick check follows below)
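
Once both scripts have run, a quick way to confirm that each daemon landed on the right host is to run jps in every container and compare against the container plan above (a minimal check, run as the hadoop user):

    # on hadoop02: expect NameNode, DataNode and NodeManager
    jps
    # on hadoop03: expect ResourceManager, NodeManager and DataNode
    jps
    # on hadoop04: expect SecondaryNameNode, DataNode and NodeManager
    jps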

Test

  • Create a test directory: hadoop fs -mkdir /input

  • Visit http://127.0.0.1:9870/explorer.html#/input to confirm the directory was created (a fuller smoke test follows below)
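
For a slightly fuller smoke test, upload a file and read it back (the file name and contents here are just examples):

    echo "hello hadoop" > /tmp/test.txt
    hadoop fs -put /tmp/test.txt /input
    hadoop fs -ls /input
    hadoop fs -cat /input/test.txt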

Review of the setup process

1. Build the environment

  • Pull the centos:7 image.

    docker pull centos:7
  • Create a user.

    useradd hadoop
    passwd hadoop
  • Configure the Java environment.

    Download the JDK and configure JAVA_HOME.
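
    One common way to wire this up is to export the variables in /etc/profile (the JDK path below is only an assumed example):

    # append to /etc/profile, then run: source /etc/profile
    export JAVA_HOME=/usr/local/jdk1.8.0_212  # example path; adjust to where the JDK was unpacked
    export PATH=$PATH:$JAVA_HOME/bin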
  • Configure the Hadoop environment.

    Download Hadoop from: https://ftp.jaist.ac.jp/pub/apache/hadoop/common/hadoop-3.2.2/hadoop-3.2.2.tar.gz
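
    A minimal sketch of unpacking the tarball and exposing HADOOP_HOME (the target directory is an assumption):

    tar -zxvf hadoop-3.2.2.tar.gz -C /usr/local
    # append to /etc/profile, then run: source /etc/profile
    export HADOOP_HOME=/usr/local/hadoop-3.2.2
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin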
  • Set up SSH password-free login

    The base image has no SSH service by default, so install OpenSSH first (yum install openssh-clients openssh-server) and adjust the configuration in /etc/ssh/sshd_config as needed; a key-based login sketch follows.
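
    A minimal key-based setup, run as the hadoop user on each node (host names taken from the cluster plan):

    ssh-keygen -t rsa        # accept the defaults
    ssh-copy-id hadoop02
    ssh-copy-id hadoop03
    ssh-copy-id hadoop04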
  • Configure /etc/hosts based on the cluster plan

    172.18.0.11 hadoop01
    172.18.0.12 hadoop02
    172.18.0.13 hadoop03
    172.18.0.14 hadoop04
  • Edit the Hadoop configuration files

    Configure core-site.xml

    <configuration>
      <!-- Specify the address of the NameNode -->
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop02:8020</value>
      </property>
      <!-- Hadoop data storage directory -->
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/data</value>
      </property>
      <!-- Static user that the HDFS web UI logs in as -->
      <property>
        <name>hadoop.http.staticuser.user</name>
        <value>hadoop</value>
      </property>
    </configuration>

    Configure hdfs-site.xml

    <configuration>
      <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop02:9870</value>
      </property>
      <!-- 2NN web access address -->
      <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop04:9868</value>
      </property>
    </configuration>

    Configure yarn-site.xml

    <configuration>
      <!-- Site specific YARN configuration properties -->
      <!-- Specify the shuffle service -->
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <!-- Specify the ResourceManager address -->
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop03</value>
      </property>
      <!-- Environment variables to inherit -->
      <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
      </property>
    </configuration>

    Configure mapred-site.xml

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>

    Configure workers

    hadoop02
    hadoop03
    hadoop04

Docker Compose configuration

Docker Compose can also be used to start the containers directly (a usage sketch follows below).
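
A minimal usage sketch, assuming the file below is saved as docker-compose.yml in the current directory:

    docker-compose up -d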

After the containers are up, start the Hadoop cluster:

On hadoop02

  • Switch to the hadoop user: su hadoop
  • Format the NameNode: hdfs namenode -format
  • Start HDFS: start-dfs.sh

On hadoop03

  • Switch to the hadoop user: su hadoop
  • Start YARN: start-yarn.sh

version: "3"

services:
  hadoop02:
    image: guozhenhua/hadoop:5
    container_name: hadoop02
    privileged: true
    ports:
      - "9870:9870"
    networks:
      net1:
        ipv4_address: 172.18.0.12

  hadoop03:
    image: guozhenhua/hadoop:5
    container_name: hadoop03
    privileged: true
    ports:
      - "8088:8088"
    networks:
      net1:
        ipv4_address: 172.18.0.13
      
  hadoop04:
    image: guozhenhua/hadoop:5
    container_name: hadoop04
    privileged: true
    ports:
      - "9868:9868"
    networks:
      net1:
        ipv4_address: 172.18.0.14

networks:
  net1:
    driver: bridge
    ipam:
      config:
        - subnet: 172.18.0.0/24
          gateway: 172.18.0.1


Note:

  • The IP addresses and container_name values in the configuration file cannot be changed arbitrarily; if you do change them, you must also update /etc/hosts inside the containers.
  • privileged: true is required; without it the sshd service cannot start, and the nodes will not be able to connect to each other when the cluster starts.
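
If a container name or IP does change, a quick way to verify the mapping inside a running container (standard Docker commands; the container name comes from this setup):

    docker exec -it hadoop02 cat /etc/hosts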