Installation environment

VMware® Workstation 16 Pro

VM OS: CentOS 7.9-minimal

VM IP addresses:,, and

Preliminary planning

A Hadoop cluster consists of two clusters: an HDFS cluster and a YARN cluster. The two are logically separate but usually share the same hosts.

Both clusters use a standard master-slave architecture.

Roles (daemons) in the HDFS cluster:

  • Master role: NameNode
  • Worker role: DataNode
  • Auxiliary role for the master: SecondaryNameNode

Roles (daemons) in the YARN cluster:

  • Master role: ResourceManager
  • Worker role: NodeManager

Cluster planning:

Server | IP address | Running roles (daemons)
node1 | | NameNode, DataNode, ResourceManager, NodeManager
node2 | | SecondaryNameNode, DataNode, NodeManager
node3 | | DataNode, NodeManager

Environment configuration

Perform the following steps on every VM as the root user.

1. Disable the firewall

systemctl stop firewalld
systemctl disable firewalld
2. Synchronize time

yum -y install ntpdate
3. Configure the host name

vi /etc/hostname

Set the host names of the three VMs to node1, node2, and node3, respectively.

4. Configure the hosts file

vi /etc/hosts

Add one line per VM that maps its IP address to its host name:

<IP address of node1> node1
<IP address of node2> node2
<IP address of node3> node3
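
Name resolution problems tend to surface only later, when the start scripts try to reach the workers over SSH, so it can help to check the entries right away. The following is a minimal sketch rather than part of the original setup: hosts_missing_from is a hypothetical helper that prints every host name that has no entry in a given hosts file.

```shell
# Print each host name that has no entry in the given hosts file.
# Usage: hosts_missing_from FILE NAME...
hosts_missing_from() {
  file=$1
  shift
  for name in "$@"; do
    # Column 1 is the IP address; the name may be any later column.
    awk -v n="$name" '
      /^[[:space:]]*#/ { next }
      { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
      END { exit !found }
    ' "$file" || echo "$name"
  done
}

# No output means all three names are present.
hosts_missing_from /etc/hosts node1 node2 node3
```

Any name that is printed still needs a line in /etc/hosts on that VM.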
5. Install JDK

yum -y install java-1.8.0-openjdk java-1.8.0-openjdk-devel

Configure JAVA_HOME:

cat <<EOF | tee /etc/profile.d/
export JAVA_HOME=\$(dirname \$(dirname \$(readlink \$(readlink \$(which javac)))))
export PATH=\$PATH:\$JAVA_HOME/bin
EOF
source /etc/profile.d/
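
The nested readlink calls above follow the two-level symlink chain that the alternatives system puts in front of javac, and the two dirname calls then strip the trailing /bin/javac. As a sketch of the same idea, readlink -f (GNU coreutils) resolves a chain of any depth in one call. The helper name jdk_home_of and the directory names in the throwaway demo tree are made up for illustration.

```shell
# Resolve the JDK home from a javac path: follow every symlink,
# then drop the trailing /bin/javac.
jdk_home_of() {
  dirname "$(dirname "$(readlink -f "$1")")"
}

# Demo on a temporary tree that mimics a two-level symlink chain.
demo=$(mktemp -d)
mkdir -p "$demo/jvm/java-1.8.0-openjdk/bin" "$demo/alternatives" "$demo/bin"
touch "$demo/jvm/java-1.8.0-openjdk/bin/javac"
ln -s "$demo/jvm/java-1.8.0-openjdk/bin/javac" "$demo/alternatives/javac"
ln -s "$demo/alternatives/javac" "$demo/bin/javac"

# Prints the path of the java-1.8.0-openjdk directory inside the demo tree.
jdk_home_of "$demo/bin/javac"
rm -rf "$demo"
```

On a real VM the equivalent call would be jdk_home_of "$(command -v javac)".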


6. Create the hadoop user and set a password

adduser hadoop
usermod -aG wheel hadoop
passwd hadoop

Create a local directory for HDFS data:

mkdir /home/hadoop/data
chown hadoop: /home/hadoop/data
7. Configure environment variables

echo 'export HADOOP_HOME=/home/hadoop/hadoop-3.3.2' >> /etc/profile
echo 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' >> /etc/profile
source /etc/profile

8. Configure SSH

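yum -y install openssh openssh-clients

Switch to the hadoop user, generate a key pair, and copy the public key to every node, including the local one:

ssh-keygen
ssh-copy-id node1
ssh-copy-id node2
ssh-copy-id node3

Repeat this on each VM. The session on node1 looks like this: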

[hadoop@node1 ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/
The key fingerprint is:
SHA256:gFs4NEpc6MIVv7/r5f2rUFdOi7ht11GceM3fd/Uq/nU hadoop@node1
The key's randomart image is:
+---[RSA 2048]----+
| .. += |
| .o+.+ .oo|
|.. o +.o . =*|
|... +.. . * B|
| . .. S o o +*|
| . . + .=|
| . o .. o.. E|
| + o...... . |
| +.. o++o |
+----[SHA256]-----+
[hadoop@node1 ~]$ ssh-copy-id node1
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/"
The authenticity of host 'node1 (' can't be established fingerprint is SHA256:BxdxJ5ONWI6xkPrFWxy9MIFs/B3IpEgjhFxiwI6KOLU.
ECDSA key fingerprint is MD5:78:ea:2d:36:7e:eb:83:47:8f:61:c6:70:b6:0f:20:d6.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@node1's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'node1'"
and check to make sure that only the key(s) you wanted were added.

[hadoop@node1 ~]$ ssh-copy-id node2
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/"
The authenticity of host 'node2 (' can't be established fingerprint is SHA256:BxdxJ5ONWI6xkPrFWxy9MIFs/B3IpEgjhFxiwI6KOLU.
ECDSA key fingerprint is MD5:78:ea:2d:36:7e:eb:83:47:8f:61:c6:70:b6:0f:20:d6.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@node2's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'node2'"
and check to make sure that only the key(s) you wanted were added.

[hadoop@node1 ~]$ ssh-copy-id node3
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/"
The authenticity of host 'node3 (' can't be established fingerprint is SHA256:BxdxJ5ONWI6xkPrFWxy9MIFs/B3IpEgjhFxiwI6KOLU.
ECDSA key fingerprint is MD5:78:ea:2d:36:7e:eb:83:47:8f:61:c6:70:b6:0f:20:d6.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@node3's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'node3'"
and check to make sure that only the key(s) you wanted were added.

[hadoop@node1 ~]$
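
Because the one-click start scripts depend on passwordless logins, it may be worth confirming them before moving on. The sketch below is an illustration, not part of the original procedure: check_passwordless_ssh is a hypothetical helper that tries a non-interactive login to each host. BatchMode=yes makes ssh fail instead of asking for a password, and the 5-second ConnectTimeout is an arbitrary choice.

```shell
# Try a non-interactive login to every host given as an argument.
# Returns the number of hosts that could not be reached with key authentication.
check_passwordless_ssh() {
  failures=0
  for host in "$@"; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failures=$((failures + 1))
    fi
  done
  return "$failures"
}

# Run as the hadoop user on each VM.
check_passwordless_ssh node1 node2 node3 ||
  echo "passwordless SSH is not ready on every node"
```

A FAIL line means that ssh-copy-id for that host needs to be repeated from the VM where the check was run.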

Download and install

Install and configure Hadoop on node1 as the hadoop user, then copy the installation directory to the other two VMs.

1. Download and unpack

Connect to node1 as the hadoop user and download the installation package to the /home/hadoop directory:

cd /home/hadoop
curl -Ok <URL of hadoop-3.3.2.tar.gz>

Then unpack it:

tar zxf hadoop-3.3.2.tar.gz

Next, Hadoop is configured through a configuration file.

Hadoop configuration files fall into three categories:

  • Default configuration files: core-default.xml, hdfs-default.xml, yarn-default.xml, and mapred-default.xml. These files are read-only and hold the default value of every parameter.
  • Custom configuration files: etc/hadoop/core-site.xml, etc/hadoop/hdfs-site.xml, etc/hadoop/yarn-site.xml, and etc/hadoop/mapred-site.xml. These files store site-specific settings, which override the defaults.
  • Environment configuration files: etc/hadoop/hadoop-env.sh, etc/hadoop/mapred-env.sh, and etc/hadoop/yarn-env.sh. These files configure the Java runtime environment of each daemon.

2. Configure hadoop-env.sh

cd hadoop-3.3.2
vi etc/hadoop/hadoop-env.sh

Add the following:

export HDFS_NAMENODE_USER=hadoop
export HDFS_DATANODE_USER=hadoop
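
At a minimum, JAVA_HOME must be set in this file. Individual daemons can also be configured separately through the following variables:

Daemon | Environment variable
NameNode | HDFS_NAMENODE_OPTS
DataNode | HDFS_DATANODE_OPTS
Secondary NameNode | HDFS_SECONDARYNAMENODE_OPTS
ResourceManager | YARN_RESOURCEMANAGER_OPTS
NodeManager | YARN_NODEMANAGER_OPTS
WebAppProxy | YARN_PROXYSERVER_OPTS
Map Reduce Job History Server | MAPRED_HISTORYSERVER_OPTS

For example, to run the NameNode with the parallel garbage collector and a 4 GB heap: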

export HDFS_NAMENODE_OPTS="-XX:+UseParallelGC -Xmx4g"

3. Configure core-site.xml

The settings in this file override those in core-default.xml.

vi etc/hadoop/core-site.xml

Add the following:

<!-- Set the default file system -->

<!-- Set the local path where Hadoop stores its data -->

<!-- Set the user identity for the Hadoop web UI -->

<!-- Proxy user settings for Hive -->

<!-- Retention time for files in the trash -->
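
As a rough sketch of what a core-site.xml covering the five settings named in the comments could look like, the following writes a complete file. The property names are standard Hadoop keys, but every value is an assumption except /home/hadoop/data, which matches the directory created earlier: port 8020, the proxy user name hadoop, the wildcard values, and the 1440-minute trash interval are illustrative choices. The command overwrites the existing file, so merge by hand if the file already contains settings.

```shell
# Run from the hadoop-3.3.2 directory; HADOOP_CONF_DIR overrides the target.
conf_dir="${HADOOP_CONF_DIR:-etc/hadoop}"
mkdir -p "$conf_dir"

cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Default file system (host node1 and port 8020 are assumptions) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node1:8020</value>
  </property>
  <!-- Local path where Hadoop stores its data -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/data</value>
  </property>
  <!-- User identity for the Hadoop web UI -->
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>hadoop</value>
  </property>
  <!-- Proxy user settings (assumed: user hadoop, any host, any group) -->
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
  <!-- Trash retention in minutes (1440 = one day, an assumed value) -->
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
</configuration>
EOF
```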

4. Configure hdfs-site.xml

The settings in this file override those in hdfs-default.xml.

vi etc/hadoop/hdfs-site.xml

Add the following:

<!-- Set the host where the SecondaryNameNode (SNN) process runs -->
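
A minimal sketch of an hdfs-site.xml for this setting, assuming the SecondaryNameNode runs on node2 (as the start-up output later in this guide shows) and listens on 9868, the default HTTP port in Hadoop 3. The command overwrites the existing file.

```shell
# Run from the hadoop-3.3.2 directory; HADOOP_CONF_DIR overrides the target.
conf_dir="${HADOOP_CONF_DIR:-etc/hadoop}"
mkdir -p "$conf_dir"

cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Host and HTTP port of the SecondaryNameNode (port 9868 is an assumption) -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node2:9868</value>
  </property>
</configuration>
EOF
```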

5. Configure mapred-site.xml

The settings in this file override those in mapred-default.xml.

vi etc/hadoop/mapred-site.xml

Add the following:

<!-- Set the default execution mode for MapReduce jobs: yarn (cluster mode) or local (local mode) -->

<!-- MapReduce job history server address -->

<!-- MapReduce job history server web address -->
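
The three settings could be written as in the sketch below. The property names are standard MapReduce keys; placing the history server on node1 and using ports 10020 and 19888 (the defaults) are assumptions. The command overwrites the existing file.

```shell
# Run from the hadoop-3.3.2 directory; HADOOP_CONF_DIR overrides the target.
conf_dir="${HADOOP_CONF_DIR:-etc/hadoop}"
mkdir -p "$conf_dir"

cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Run MapReduce jobs on YARN instead of in local mode -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- Job history server RPC address (host and port are assumptions) -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>node1:10020</value>
  </property>
  <!-- Job history server web address (host and port are assumptions) -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node1:19888</value>
  </property>
</configuration>
EOF
```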



6. Configure yarn-site.xml

The settings in this file override those in yarn-default.xml.

vi etc/hadoop/yarn-site.xml

Add the following:

<!-- Set the host that runs the YARN master role (ResourceManager) -->


<!-- Physical memory limits for containers -->

<!-- Virtual memory limits for containers -->

<!-- Enable log aggregation -->

<!-- Set the YARN history server address -->
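
A possible yarn-site.xml for these settings is sketched below. The property names are standard YARN keys, but the values are assumptions: node1 as the ResourceManager host follows the cluster plan, both memory checks are switched off as is common on small test VMs, and the log server URL points at a job history server assumed to run on node1:19888. The yarn.nodemanager.aux-services entry is not named in the comments; it is included because MapReduce jobs on YARN need the shuffle service. The command overwrites the existing file.

```shell
# Run from the hadoop-3.3.2 directory; HADOOP_CONF_DIR overrides the target.
conf_dir="${HADOOP_CONF_DIR:-etc/hadoop}"
mkdir -p "$conf_dir"

cat > "$conf_dir/yarn-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Host that runs the ResourceManager -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1</value>
  </property>
  <!-- Shuffle service required by MapReduce (not listed in the comments) -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Physical memory limit check for containers (false is an assumption) -->
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <!-- Virtual memory limit check for containers (false is an assumption) -->
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <!-- Enable log aggregation -->
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <!-- History server that serves aggregated logs (URL is an assumption) -->
  <property>
    <name>yarn.log.server.url</name>
    <value>http://node1:19888/jobhistory/logs</value>
  </property>
</configuration>
EOF
```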

7. Configure the workers file

vi etc/hadoop/workers

Delete the original content and add the following:

node1
node2
node3

8. Copy the configured installation directory to node2 and node3

scp -r /home/hadoop/hadoop-3.3.2 hadoop@node2:/home/hadoop/
scp -r /home/hadoop/hadoop-3.3.2 hadoop@node3:/home/hadoop/

Start the cluster

Hadoop provides two startup modes:

  • Start processes one by one with commands. The commands are run manually on each machine, which gives precise control over when each process starts.
  • Start everything with one script. This requires passwordless SSH logins between the machines and a configured etc/hadoop/workers file.

Commands to start processes one by one (pass exactly one daemon name per invocation; the | separates the alternatives):

# HDFS cluster
$HADOOP_HOME/bin/hdfs --daemon start namenode | datanode | secondarynamenode

# YARN cluster
$HADOOP_HOME/bin/yarn --daemon start resourcemanager | nodemanager | proxyserver
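
To make the per-machine variant concrete, the sketch below prints the commands each host would run under the cluster plan from the beginning of this guide: NameNode and ResourceManager on node1, SecondaryNameNode on node2, and a DataNode plus a NodeManager on every node. The function name start_commands is made up for illustration, and a host name that is neither node1 nor node2 is simply treated as a plain worker.

```shell
# Print the daemon start commands for one host, following the cluster plan.
start_commands() {
  case "$1" in
    node1)
      echo 'hdfs --daemon start namenode'
      echo 'yarn --daemon start resourcemanager'
      ;;
    node2)
      echo 'hdfs --daemon start secondarynamenode'
      ;;
  esac
  # Every node in this plan is also a worker.
  echo 'hdfs --daemon start datanode'
  echo 'yarn --daemon start nodemanager'
}

# Show the commands for the local machine (uname -n prints its host name).
start_commands "$(uname -n)"
```

The function only prints. To execute the commands, its output could be piped to sh, provided $HADOOP_HOME/bin is on the PATH.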

Scripts to start the cluster:

  • HDFS cluster: $HADOOP_HOME/sbin/start-dfs.sh starts all processes in the HDFS cluster.
  • YARN cluster: $HADOOP_HOME/sbin/start-yarn.sh starts all processes in the YARN cluster.
  • Hadoop cluster: $HADOOP_HOME/sbin/start-all.sh starts all processes in both the HDFS cluster and the YARN cluster.

1. Format the file system

Before starting the cluster for the first time, format HDFS. Run this once only, on node1.

[hadoop@node1 ~]$ hdfs namenode -format
WARNING: /home/hadoop/hadoop-3.3.2/logs does not exist. Creating.
2022-03-17 23:22:55,296 INFO namenode.NameNode: STARTUP_MSG:
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = node1/
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 3.3.2
STARTUP_MSG:   classpath = /home/hadoop/hadoop-3.3.2/etc/hadoop:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/accessors-smart-2.4.7.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/animal-sniffer-annotations-1.17.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/asm-5.0.4.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/audience-annotations-0.5.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/avro-1.7.7.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/checker-qual-2.5.2.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-beanutils-1.9.4.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-cli-1.2.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-codec-1.11.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-collections-3.2.2.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-compress-1.21.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-configuration2-2.1.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-daemon-1.0.13.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-io-2.8.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-lang3-3.12.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-logging-1.1.3.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-math3-3.1.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-net-3.6.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/commons-text-1.4.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/curator-client-4.2.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/curator-framework-4.2.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/curator-recipes-4.2.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/dnsjava-2.1.7.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/failureaccess-1.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/gson-2.8.9.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/guava-27.0-jre.jar:/home/hadoop/ha
doop-3.3.2/share/hadoop/common/lib/hadoop-annotations-3.3.2.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/hadoop-auth-3.3.2.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/hadoop-shaded-guava-1.1.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/hadoop-shaded-protobuf_3_7-1.1.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/httpclient-4.5.13.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/httpcore-4.4.13.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/j2objc-annotations-1.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jackson-annotations-2.13.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jackson-core-2.13.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jackson-databind-2.13.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jakarta.activation-api-1.2.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/javax.servlet-api-3.1.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jaxb-api-2.2.11.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jcip-annotations-1.0-1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jersey-core-1.19.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jersey-json-1.19.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jersey-server-1.19.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jersey-servlet-1.19.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jettison-1.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jetty-http-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jetty-io-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/com
mon/lib/jetty-security-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jetty-server-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jetty-servlet-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jetty-util-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jetty-util-ajax-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jetty-webapp-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jetty-xml-9.4.43.v20210629.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jsch-0.1.55.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/json-smart-2.4.7.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jsp-api-2.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jsr305-3.0.2.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jsr311-api-1.1.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/jul-to-slf4j-1.7.30.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerb-admin-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerb-client-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerb-common-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerb-core-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerb-crypto-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerb-identity-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerb-server-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerb-simplekdc-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerb-util-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerby-asn1-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerby-config-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerby-pkix-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerby-util-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/kerby-xdr-1.0.1.jar:/home/hadoop/hadoop-3.3.2/share/ha
doop/common/lib/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/log4j-1.2.17.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/metrics-core-3.2.4.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/netty-3.10.6.Final.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/nimbus-jose-jwt-9.8.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/paranamer-2.3.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/re2j-1.1.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/slf4j-api-1.7.30.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar:/home/hadoop/hadoop-3.3.2/share/hadoop/common/lib/snappy-java-
STARTUP_MSG:   build = git@github.com:apache/hadoop.git -r 0bcb014209e219273cb6fd4152df7df713cbac61; compiled by 'chao' on 2022-02-21T18:39Z
STARTUP_MSG:   java = 1.8.0_322
2022-03-17 23:22:55,312 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2022-03-17 23:22:55,408 INFO namenode.NameNode: createNameNode [-format]
2022-03-17 23:22:55,800 INFO namenode.NameNode: Formatting using clusterid: CID-4271710c-605c-44fe-be87-6cbbcbb60338
2022-03-17 23:22:55,834 INFO namenode.FSEditLog: Edit logging is async:true
2022-03-17 23:22:55,870 INFO namenode.FSNamesystem: KeyProvider: null
2022-03-17 23:22:55,872 INFO namenode.FSNamesystem: fsLock is fair: true
2022-03-17 23:22:55,873 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
2022-03-17 23:22:55,886 INFO namenode.FSNamesystem: fsOwner                = hadoop (auth:SIMPLE)
2022-03-17 23:22:55,886 INFO namenode.FSNamesystem: supergroup             = supergroup
2022-03-17 23:22:55,886 INFO namenode.FSNamesystem: isPermissionEnabled    = true
2022-03-17 23:22:55,886 INFO namenode.FSNamesystem: isStoragePolicyEnabled = true
2022-03-17 23:22:55,886 INFO namenode.FSNamesystem: HA Enabled: false
2022-03-17 23:22:55,930 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2022-03-17 23:22:55,940 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
2022-03-17 23:22:55,941 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2022-03-17 23:22:55,944 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2022-03-17 23:22:55,944 INFO blockmanagement.BlockManager: The block deletion will start around 2022 Mar 17 23:22:55
2022-03-17 23:22:55,947 INFO util.GSet: Computing capacity for map BlocksMap
2022-03-17 23:22:55,947 INFO util.GSet: VM type       = 64-bit
2022-03-17 23:22:55,950 INFO util.GSet: 2.0% max memory 839.5 MB = 16.8 MB
2022-03-17 23:22:55,950 INFO util.GSet: capacity      = 2^21 = 2097152 entries
2022-03-17 23:22:55,959 INFO blockmanagement.BlockManager: Storage policy satisfier is disabled
2022-03-17 23:22:55,959 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
2022-03-17 23:22:55,968 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.999
2022-03-17 23:22:55,968 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
2022-03-17 23:22:55,968 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
2022-03-17 23:22:55,969 INFO blockmanagement.BlockManager: defaultReplication         = 3
2022-03-17 23:22:55,969 INFO blockmanagement.BlockManager: maxReplication             = 512
2022-03-17 23:22:55,969 INFO blockmanagement.BlockManager: minReplication             = 1
2022-03-17 23:22:55,969 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
2022-03-17 23:22:55,969 INFO blockmanagement.BlockManager: redundancyRecheckInterval  = 3000ms
2022-03-17 23:22:55,969 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
2022-03-17 23:22:55,969 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
2022-03-17 23:22:55,996 INFO namenode.FSDirectory: GLOBAL serial map: bits=29 maxEntries=536870911
2022-03-17 23:22:55,996 INFO namenode.FSDirectory: USER serial map: bits=24 maxEntries=16777215
2022-03-17 23:22:55,996 INFO namenode.FSDirectory: GROUP serial map: bits=24 maxEntries=16777215
2022-03-17 23:22:55,996 INFO namenode.FSDirectory: XATTR serial map: bits=24 maxEntries=16777215
2022-03-17 23:22:56,023 INFO util.GSet: Computing capacity for map INodeMap
2022-03-17 23:22:56,023 INFO util.GSet: VM type       = 64-bit
2022-03-17 23:22:56,023 INFO util.GSet: 1.0% max memory 839.5 MB = 8.4 MB
2022-03-17 23:22:56,023 INFO util.GSet: capacity      = 2^20 = 1048576 entries
2022-03-17 23:22:56,024 INFO namenode.FSDirectory: ACLs enabled? true
2022-03-17 23:22:56,024 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
2022-03-17 23:22:56,024 INFO namenode.FSDirectory: XAttrs enabled? true
2022-03-17 23:22:56,025 INFO namenode.NameNode: Caching file names occurring more than 10 times
2022-03-17 23:22:56,030 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536
2022-03-17 23:22:56,033 INFO snapshot.SnapshotManager: SkipList is disabled
2022-03-17 23:22:56,037 INFO util.GSet: Computing capacity for map cachedBlocks
2022-03-17 23:22:56,037 INFO util.GSet: VM type       = 64-bit
2022-03-17 23:22:56,037 INFO util.GSet: 0.25% max memory 839.5 MB = 2.1 MB
2022-03-17 23:22:56,037 INFO util.GSet: capacity      = 2^18 = 262144 entries
2022-03-17 23:22:56,047 INFO metrics.TopMetrics: NNTop conf: = 10
2022-03-17 23:22:56,047 INFO metrics.TopMetrics: NNTop conf: = 10
2022-03-17 23:22:56,047 INFO metrics.TopMetrics: NNTop conf: = 1,5,25
2022-03-17 23:22:56,051 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2022-03-17 23:22:56,051 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2022-03-17 23:22:56,053 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2022-03-17 23:22:56,053 INFO util.GSet: VM type       = 64-bit
2022-03-17 23:22:56,053 INFO util.GSet: 0.029999999329447746% max memory 839.5 MB = 257.9 KB
2022-03-17 23:22:56,053 INFO util.GSet: capacity      = 2^15 = 32768 entries
2022-03-17 23:22:56,080 INFO namenode.FSImage: Allocated new BlockPoolId: BP-571583129-
2022-03-17 23:22:56,101 INFO common.Storage: Storage directory /home/hadoop/data/dfs/name has been successfully formatted.
2022-03-17 23:22:56,128 INFO namenode.FSImageFormatProtobuf: Saving image file /home/hadoop/data/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
2022-03-17 23:22:56,226 INFO namenode.FSImageFormatProtobuf: Image file /home/hadoop/data/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 401 bytes saved in 0 seconds .
2022-03-17 23:22:56,241 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2022-03-17 23:22:56,259 INFO namenode.FSNamesystem: Stopping services started for active state
2022-03-17 23:22:56,260 INFO namenode.FSNamesystem: Stopping services started for standby state
2022-03-17 23:22:56,264 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid=0 when meet shutdown.
2022-03-17 23:22:56,264 INFO namenode.NameNode: SHUTDOWN_MSG:
SHUTDOWN_MSG: Shutting down NameNode at node1/
[hadoop@node1 ~]$
2. Start the HDFS cluster
The start-dfs.sh script starts the NameNode, DataNode, and SecondaryNameNode daemons:

[hadoop@node1 hadoop-3.3.2]$ start-dfs.sh
Starting namenodes on [node1]
Starting datanodes
Warning Permanently added '' (ECDSA) to the list of known hosts.
ssh: Could not resolve hostname Name or service not known
ssh: Could not resolve hostname Name or service not known
Starting secondary namenodes [node2]
node2: WARNING: /home/hadoop/hadoop-3.3.2/logs does not exist. Creating.
[hadoop@node1 hadoop-3.3.2]$ jps
5001 DataNode
5274 Jps
4863 NameNode
[hadoop@node1 hadoop-3.3.2]$

After a successful start, you can open the NameNode web interface in a browser (default port: 9870).

3. Start the YARN cluster
The start-yarn.sh script starts the ResourceManager and NodeManager daemons:

[hadoop@node1 hadoop-3.3.2]$ start-yarn.sh
Starting resourcemanager
Starting nodemanagers
ssh: Could not resolve hostname Name or service not known
ssh: Could not resolve hostname Name or service not known
[hadoop@node1 hadoop-3.3.2]$ jps
5536 NodeManager
5395 ResourceManager
5001 DataNode
5867 Jps
4863 NameNode
[hadoop@node1 hadoop-3.3.2]$

After the ResourceManager is successfully started, you can access the Web UI of ResourceManager (default port: 8088) in the browser.

In addition to the start-dfs.sh and start-yarn.sh scripts, you can use the start-all.sh script to start all Hadoop processes at once.
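
Once the scripts have finished, jps on each node should list the daemons planned for that node. As a rough sketch, the comparison can be automated. The helper name missing_daemons is made up for illustration; it reads jps output on standard input and prints every expected daemon that is absent.

```shell
# Print each expected daemon name that is absent from the jps output on stdin.
# Usage: jps | missing_daemons NAME...
missing_daemons() {
  jps_output=$(cat)
  for daemon in "$@"; do
    # jps prints "<pid> <main class>", so compare the second column exactly.
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}

# On node1 the plan expects these four daemons; no output means all are running.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons NameNode DataNode ResourceManager NodeManager
fi
```

On node2 the expected list would be SecondaryNameNode, DataNode, and NodeManager, and on node3 only DataNode and NodeManager.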

Stop the cluster

As with starting a cluster, Hadoop provides two ways to stop a cluster.

Commands to stop processes one by one (pass exactly one daemon name per invocation; the | separates the alternatives):

# HDFS cluster
$HADOOP_HOME/bin/hdfs --daemon stop namenode | datanode | secondarynamenode

# YARN cluster
$HADOOP_HOME/bin/yarn --daemon stop resourcemanager | nodemanager | proxyserver

Scripts to stop the cluster:

  • HDFS cluster: $HADOOP_HOME/sbin/stop-dfs.sh stops all processes in the HDFS cluster.
  • YARN cluster: $HADOOP_HOME/sbin/stop-yarn.sh stops all processes in the YARN cluster.
  • Hadoop cluster: $HADOOP_HOME/sbin/stop-all.sh stops all processes in both the HDFS cluster and the YARN cluster.

Run the stop-all.sh script to stop all Hadoop processes at once:

[hadoop@node1 hadoop-3.3.2]$ stop-all.sh
WARNING: Stopping all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: Use CTRL-C to abort.
Stopping namenodes on [node1]
Stopping datanodes
ssh: Could not resolve hostname Name or service not known
ssh: Could not resolve hostname Name or service not known
Stopping secondary namenodes [node2]
Stopping nodemanagers
ssh: Could not resolve hostname Name or service not known
ssh: Could not resolve hostname Name or service not known
Stopping resourcemanager
[hadoop@node1 hadoop-3.3.2]$

