Preparation

As described in my previous blog post, I built a pseudo-distributed Hadoop setup; this walkthrough builds on that.

Modifying the Configuration Files

Specify the ResourceManager address (which the NodeManagers use to register) by modifying the yarn-site.xml file:

<property>
  <description>The hostname of the RM.</description>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop0</value>
</property>
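For context, this property belongs inside the file's <configuration> root element. A minimal yarn-site.xml for this setup might look like the sketch below; note that the mapreduce_shuffle aux-service is a common companion setting I am adding as an assumption, not something taken from this post:

<configuration>
  <!-- the ResourceManager runs on the master container from this series -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop0</value>
  </property>
  <!-- commonly set so MapReduce jobs can run on YARN; an assumption here -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>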

On hadoop0, edit the slaves file under the Hadoop configuration directory (etc/hadoop/slaves), remove the original content if any, and add the worker hostnames:

hadoop1
hadoop2
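If you prefer to write the file from the shell, here is a minimal sketch, assuming Hadoop lives at /usr/local/hadoop as the scp commands below suggest:

# overwrite etc/hadoop/slaves with the two worker hostnames
cat > /usr/local/hadoop/etc/hadoop/slaves <<'EOF'
hadoop1
hadoop2
EOF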

Then copy the whole Hadoop directory, configuration included, to the other two machines:

scp -rq /usr/local/hadoop hadoop1:/usr/local
scp -rq /usr/local/hadoop hadoop2:/usr/local
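To spot-check that the copy landed, you can read the slaves file back from a worker (this assumes the same passwordless ssh that scp relies on):

ssh hadoop1 cat /usr/local/hadoop/etc/hadoop/slaves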

Start the cluster

Run the startup script from the Hadoop directory:

sbin/start-all.sh
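Note that start-all.sh has been deprecated since Hadoop 2.x; running the HDFS and YARN scripts separately is equivalent and makes it easier to see which half fails:

sbin/start-dfs.sh     # NameNode and SecondaryNameNode here, DataNodes on the workers
sbin/start-yarn.sh    # ResourceManager here, NodeManagers on the workers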

To verify that the cluster is running properly, first check the processes on each node. hadoop0 should be running these:

[root@hadoop0 hadoop]# jps
4643 Jps
4073 NameNode
4216 SecondaryNameNode
4381 ResourceManager

hadoop1 should be running these:

[root@hadoop1 hadoop]# jps
715 NodeManager
849 Jps
645 DataNode

hadoop2 should be running these:

[root@hadoop2 hadoop]# jps
456 NodeManager
589 Jps
388 DataNode
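Instead of logging in to each machine, you can run jps on all three nodes in one pass; a small sketch, assuming passwordless ssh and that jps is on each node's PATH:

for h in hadoop0 hadoop1 hadoop2; do echo "== $h =="; ssh $h jps; done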

Access cluster services through a browser

Since ports 50070 and 8088 were mapped to the corresponding host ports when the hadoop0 container was started, the Hadoop cluster services inside the container can be accessed directly from the host. The host is a Huawei Cloud server, so its public IP is not shown here.
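For reference, that kind of mapping comes from the port flags given when the container is created; the sketch below is hypothetical (the image name is a placeholder, and the actual flags are in the previous post), followed by a quick reachability check from the host:

# hypothetical docker run flags; <hadoop-image> is a placeholder
docker run -itd --name hadoop0 -p 50070:50070 -p 8088:8088 <hadoop-image>
# 50070 is the NameNode web UI, 8088 the YARN ResourceManager UI
curl -s http://localhost:50070 >/dev/null && echo "NameNode UI reachable"
curl -s http://localhost:8088 >/dev/null && echo "YARN UI reachable"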

Conclusion

1. yum update is too slow. This can be resolved by switching the repo to a domestic mirror:

wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo

or, using the 163 mirror:

wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.163.com/.help/CentOS7-Base-163.repo

yum clean all

yum makecache
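To confirm the mirror is active afterwards, yum repolist lists the enabled repos:

yum repolist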

2. Installing the JDK fails with the error:

/usr/local/java/jdk1.8.0_25/bin/java: cannot execute binary file

This is because the Huawei Cloud server has an ARM64 (aarch64) CPU, so it needs the Linux ARM64 build of the JDK, which you can download from the official website.
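You can confirm the architecture mismatch before downloading anything; uname and file are standard tools, and the JDK path is the one from the error above:

# aarch64 output means the machine needs an ARM64 JDK build
uname -m
# shows which architecture the java binary was compiled for
file /usr/local/java/jdk1.8.0_25/bin/java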

3. The DataNode fails to start because the IDs of the NameNode and DataNode became inconsistent after the filesystem was formatted multiple times. The most direct and effective fix is to modify the DataNode's clusterID (in the /hadoop/tmp/dfs/data/current/VERSION file) or the NameNode's clusterID (in the /hadoop/tmp/dfs/name/current/VERSION file) so that the two match.

As for why the clusterIDs differ: when the HDFS cluster is reformatted, the NameNode's clusterID is regenerated but the DataNode's clusterID is left unchanged, so the two no longer match and the DataNode cannot start.

Alternatively, you can delete the tmp directory and reformat; either approach works.
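To see the mismatch directly, compare the clusterID lines in the two VERSION files mentioned above (the first on hadoop0, the second on a worker):

# clusterID recorded by the NameNode
grep clusterID /hadoop/tmp/dfs/name/current/VERSION
# clusterID recorded by a DataNode; if it differs, copy the NameNode's value here
grep clusterID /hadoop/tmp/dfs/data/current/VERSION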

4. Accessing localhost:50070 fails: the NameNode does not come up on the default HTTP port, so I set the port manually by adding the following to hdfs-site.xml:

<property>
  <name>dfs.http.address</name>
  <value>hadoop0:50070</value>
</property>
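After restarting HDFS you can verify the setting took effect; hdfs getconf is a standard Hadoop utility for reading resolved configuration values:

hdfs getconf -confKey dfs.http.address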