Recently, the project I am responsible for has been preparing to adopt big-data storage, built mainly around the Hadoop platform. Although we plan to use the CDH distribution of Hadoop eventually, we use vanilla Apache Hadoop for early development to keep things simple, and will prepare a better environment for expansion later.

Environment preparation

All three servers run CentOS 7.6. If you perform these operations as a non-root user, pay attention to permissions.

Base machine allocation

The cluster is set up on three newly purchased servers, planned as follows:

hostname    ip                description
tidb1       192.168.108.66    namenode and datanode
tidb2       192.168.108.67    namenode and datanode
tidb3       192.168.108.68    namenode and datanode

In a production environment, you are advised to separate NameNodes from DataNodes: for example, two machines acting as NameNodes and another three as DataNodes.

The components installed on each machine are as follows:

                 tidb1   tidb2   tidb3
NameNode           √       √
DataNode           √       √       √
ResourceManager    √       √
NodeManager        √       √       √
Zookeeper          √       √       √
journalnode        √       √       √
zkfc               √       √

From Hadoop 3.0 onwards, more than two NameNodes can be deployed for an even higher level of availability. The layout above is fine for a basic test and development environment, and more machines can be added here later.

Firewall

All three machines need to do this

Before deploying the cluster, disable the firewall on all nodes; otherwise some ports cannot be reached during deployment.

CentOS has two firewall mechanisms: firewalld and iptables. Since CentOS 7.0 the default is firewalld, but both kinds of firewall policy may be present, so handle both.

firewalld

  1. Check the firewall status
[root@tidb1 sbin]# firewall-cmd --state
running
  2. Stop the firewall
systemctl stop firewalld.service
  3. Disable it from starting on boot
systemctl disable firewalld.service

After these three steps, firewall configuration problems will no longer appear, even after a reboot.

iptables

If iptables is in use, it also needs to be stopped. If you are familiar with it, you can instead open only the required ports.

  1. Check the firewall status
service iptables status
  2. Stop the firewall
service iptables stop
Redirecting to /bin/systemctl stop  iptables.service
  3. Disable it from starting on boot
chkconfig iptables off

If the cluster is installed on another operating system, disable the firewall according to that system's policy.

SELinux

All three machines need to do this

Many guides online suggest disabling this "security-enhanced Linux". Looking into the reasons, it mostly comes down to nobody being dedicated to maintaining the whitelist of allowed operations.

We also disable it in our test environment to simplify cluster setup; in production, deploy according to your O&M requirements.

  1. Check the current state of SELinux:
getenforce
  2. Change the SELinux state (temporary change, lost after a reboot)
setenforce 0   # change SELinux to Permissive
setenforce 1   # change SELinux to Enforcing
  3. Change SELinux to disabled (permanent, remains in effect after reboot)
Edit /etc/selinux/config, set SELINUX=disabled, then run reboot for the change to take effect.
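If you prefer a one-liner for the permanent change, a minimal sketch against the stock config file (verify with getenforce after the reboot):

# back up the file and switch whatever SELINUX= line is present to disabled
sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
grep '^SELINUX=' /etc/selinux/config   # should now print SELINUX=disabled
reboot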

Fixed IP

All three machines need to do this

In an enterprise environment, if you are using real servers rather than cloud servers, you need to fix the IP addresses before putting them into use; otherwise an unexpected restart can change the IP addresses and the cluster will not start properly. There are two ways to fix the IPs:

  • The simplest option is to have the network administrator reserve a fixed IP for each server on the router. This is the recommended approach.
  • Configure a static IP on a dedicated network adapter. In most cases servers have dual network adapters plus optical ports. The following steps are for reference only.
  1. View the NICs (the ifcfg-enp* files are the NIC configuration files)
ls /etc/sysconfig/network-scripts/
  2. Configure the NIC IP address
vi /etc/sysconfig/network-scripts/ifcfg-enp*
# enable host-only
cd /etc/sysconfig/network-scripts/
cp ifcfg-enp0s3  ifcfg-enp0s8
  3. Change the NIC to use a static IP address:
    1. Change BOOTPROTO to static
    2. Change NAME to enp0s8
    3. Change the UUID (any value different from the original will do)
    4. Add an IPADDR for this host
    5. Add NETMASK=255.255.255.0 (the mask must match your network segment)
A sample ifcfg file is sketched below. Then restart the network service:

service network restart
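For reference, a minimal static configuration might look like the following sketch. The interface name (enp0s8), GATEWAY, DNS1 and UUID values are assumptions for illustration; use the values of your own network (the IPADDR shown is tidb1's planned address).

# /etc/sysconfig/network-scripts/ifcfg-enp0s8  (sketch, adjust to your environment)
TYPE=Ethernet
BOOTPROTO=static
NAME=enp0s8
DEVICE=enp0s8
ONBOOT=yes
UUID=0c8bbe48-1a6f-4a8c-9f3b-1c2d3e4f5a6b   # any value different from the original
IPADDR=192.168.108.66                       # tidb1's planned address
NETMASK=255.255.255.0
GATEWAY=192.168.108.1                       # assumption: your router's address
DNS1=192.168.108.1                          # assumption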

Configure hosts

All three machines need to do this

Note that when configuring the primary NameNode you need to comment out the localhost lines, otherwise the hostname cannot be resolved. On the other nodes those lines can stay.

vim /etc/hosts

[root@tidb1 network-scripts]# cat /etc/hosts
#127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.108.66 tidb1
192.168.108.67 tidb2
192.168.108.68 tidb3
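The same three mapping lines are needed on every node. Rather than editing each machine by hand, a minimal sketch for pushing the file out (assumes root SSH access; you will be prompted for passwords until passwordless login is configured below):

scp /etc/hosts root@tidb2:/etc/hosts
scp /etc/hosts root@tidb3:/etc/hosts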

Configuring passwordless login

As usual, cluster nodes need to communicate with each other over SSH, so passwordless login must be configured. The steps are as follows:

  1. Generate the key pair
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
  2. Write authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  3. Set the permissions
chmod 0600 ~/.ssh/authorized_keys
  4. Copy the key to the other servers
ssh-copy-id root@tidb2
ssh-copy-id root@tidb3
  5. Check that the machines can log in to each other without a password
ssh tidb2
ssh tidb3
ssh tidb1

These operations need to be performed on every machine. If passwordless login still fails:

  • Check whether the configured key is correct; failures are usually caused by an error in the key.
  • If the key is fine, check the file permissions (step 3 above). Directory permission problems can be fixed with:
sudo chmod 700 ~
sudo chmod 700 ~/.ssh
sudo chmod 600 ~/.ssh/authorized_keys
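To verify non-interactively that every node can reach every other node, a small sketch (run it on each machine; BatchMode makes ssh fail instead of prompting for a password):

for h in tidb1 tidb2 tidb3; do
  ssh -o BatchMode=yes root@$h hostname || echo "passwordless login to $h failed"
done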

Prepare the software to be installed

Hadoop HA is implemented with the help of ZooKeeper, so three packages need to be prepared: Hadoop, ZooKeeper, and JDK 1.8.

  1. Create storage locations for the three packages
mkdir zookeeper
mkdir hadoop
mkdir java
  2. Download the software and move it to the corresponding directories
wget http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz
wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-3.1.1/hadoop-3.1.1-src.tar.gz
Download the JDK from the Oracle website and upload it to the server directory.
  3. Unpack the archives
tar -zxvf zookeeper-3.4.13.tar.gz
tar -zxvf hadoop-3.1.1-src.tar.gz
tar -zxvf jdk1.8.tar.gz

Installation

With the basics configured, we can start installing the software. Everything is configured on the first machine and then synchronized to the other nodes with rsync.

Rsync installation

On CentOS, rsync needs to be installed on every machine.

rpm -qa | grep rsync       # check whether rsync is already installed
yum install -y rsync       # install rsync with yum

Installation of the Java environment

  1. Find the path of the unpacked JDK
cd /home/bigdata/java/jdk1.8
pwd      # record this path
  2. Configure the environment variables (append to /etc/profile)
JAVA_HOME=/home/bigdata/java/jdk1.8
JRE_HOME=/home/bigdata/java/jdk1.8/jre
CLASS_PATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JRE_HOME CLASS_PATH PATH
  3. Synchronize to the other servers
tidb2:
rsync -avup /etc/profile root@tidb2:/etc/
tidb3:
rsync -avup /etc/profile root@tidb3:/etc/
  4. Apply the changes
Run on all three machines:
source /etc/profile
If the variables were added to an individual user's file instead, run:
source ~/.bashrc
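A quick sanity check that the variables took effect (assuming the JDK path used above):

echo $JAVA_HOME        # should print /home/bigdata/java/jdk1.8
java -version          # should report a 1.8.x JDK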

ZooKeeper installation

  1. Go to the unpacked directory
cd /home/bigdata/zookeeper/zookeeper-3.4.13

  2. Create the zoo.cfg file
cd /home/bigdata/zookeeper/zookeeper-3.4.13/conf
cp zoo_sample.cfg zoo.cfg
mv zoo_sample.cfg bak_zoo_sample.cfg    # back up the sample file
  3. Edit zoo.cfg

Before editing, create the dataDir directory: mkdir -p /home/bigdata/zookeeper/zookeeper-3.4.13/tmp. Then, on top of the original contents, add the server entries and the dataDir path:


# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# the dataDir path must be created in advance
# example sakes.
dataDir=/home/bigdata/zookeeper/zookeeper-3.4.13/tmp
# the port at which the clients will connect
clientPort=2181
# note the server numbers; we will need them later
server.1=tidb1:2888:3888
server.2=tidb2:2888:3888
server.3=tidb3:2888:3888
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge
#autopurge.purgeInterval=1

The basic configuration information is as follows:

  • tickTime: the basic heartbeat unit, in milliseconds. Almost all ZK time settings are integer multiples of this value.
  • initLimit: a number of tickTime units; the time allowed for followers to synchronize with the leader after a leader election. If there are many followers or the leader's data is large, synchronization takes longer and this value should be increased accordingly. It is also the maximum wait time (setSoTimeout) for followers and observers when they start synchronizing the leader's data.
  • syncLimit: a number of tickTime units, easily confused with the value above. It is the maximum wait time for interactions between followers/observers and the leader, i.e. the timeout for normal request forwarding or ping exchanges after synchronization with the leader is complete.
  • dataDir: the directory for storing in-memory database snapshots. If a separate transaction-log directory (dataLogDir) is not specified, transaction logs are also stored here. You are advised to put the two on different devices.
  • clientPort: the port on which ZK listens for client connections, e.g. clientPort=2181
  • server.serverid=host:tickport:electionport
    server: fixed prefix
    serverid: the ID assigned to each server (must be between 1 and 255 and unique across machines)
    host: host name
    tickport: heartbeat communication port
    electionport: leader-election port
  4. Create the required directories and the myid file
mkdir -p /home/bigdata/zookeeper/zookeeper-3.4.13/tmp
echo 1 > /home/bigdata/zookeeper/zookeeper-3.4.13/tmp/myid
  5. Synchronize to the other servers
rsync -avup /home/bigdata/zookeeper root@tidb2:/home/bigdata/
rsync -avup /home/bigdata/zookeeper root@tidb3:/home/bigdata/
  6. Change myid on the other servers
tidb2:
vim /home/bigdata/zookeeper/zookeeper-3.4.13/tmp/myid    # change 1 to 2
tidb3:
vim /home/bigdata/zookeeper/zookeeper-3.4.13/tmp/myid    # change 1 to 3
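If you prefer to set all three myid files in one go from tidb1, a minimal sketch (relies on the passwordless SSH configured earlier and the ZooKeeper path used above):

for i in 1 2 3; do
  ssh root@tidb$i "echo $i > /home/bigdata/zookeeper/zookeeper-3.4.13/tmp/myid"
done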
  1. Configuring environment Variables
JAVA_HOME = / home/bigdata/Java/jdk1.8 JRE_HOME = / home/bigdata/Java/jdk1.8 / jre CLASS_PATH =. :$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/ lib ZOOKEEPER_HOME = / home/bigdata/they/they are - 3.4.13 PATH =$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$ZOOKEEPER_HOME/bin
export JAVA_HOME JRE_HOME CLASS_PATH PATH ZOOKEEPER_HOME

source /etc/profile
Copy the code
  8. Verify that everything works. After the preceding seven steps, start ZooKeeper on each server and check it.
Note: this must be executed on all three machines.
cd /home/bigdata/zookeeper/zookeeper-3.4.13/bin
./zkServer.sh start
Method 1: use jps to check whether the process started. If the jps command is not found, check the JDK PATH configuration, e.g. export PATH=$PATH:/usr/java/jdk1.8/bin
85286 QuorumPeerMain
Method 2: ./zkServer.sh status
[root@tidb1 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/bigdata/zookeeper/zookeeper-3.4.13/bin/../conf/zoo.cfg
Mode: follower
Mode: follower indicates a follower node; Mode: leader indicates the leader node.
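To check the role of every node from one place, a small sketch (assumes the same installation path on all three machines and that the environment variables are available to non-interactive shells):

for h in tidb1 tidb2 tidb3; do
  echo "== $h =="
  ssh root@$h "/home/bigdata/zookeeper/zookeeper-3.4.13/bin/zkServer.sh status"
done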

Hadoop installation

Creating the base directories

During installation we need directories to store data and log files. Data files are kept in separate directories, which must be created in advance.
  • Data storage
mkdir -p /media/data1/hdfs/data
mkdir -p /media/data2/hdfs/data
mkdir -p /media/data3/hdfs/data
  • Journal Content Store
mkdir -p /media/data1/hdfs/hdfsjournal
  • Namenode Content storage path
mkdir -p /media/data1/hdfs/name

Modify related configuration files

  • Configuring the Java Environment
Edit the hadoop-env.sh file:
vim /home/bigdata/hadoop/hadoop/etc/hadoop/hadoop-env.sh
Configure the JDK path; JVM memory sizes and similar options can also be set here.
export JAVA_HOME=/home/bigdata/java/jdk1.8
#export HADOOP_NAMENODE_OPTS=" -Xms1024m -Xmx1024m -XX:+UseParallelGC"
#export HADOOP_DATANODE_OPTS=" -Xms512m -Xmx512m"
HADOOP_LOG_DIR=/opt/data/logs/hadoop    # log file directory
  • Configure core-site.xml
vim /home/bigdata/hadoop/hadoop/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <!-- Specify the HDFS nameservice (default file system address) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cluster</value>
  </property>
  <!-- Temporary file storage directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/media/data1/hdfstmp</value>
  </property>
  <!-- Specify the zookeeper quorum; timeouts etc. can also be set here -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>tidb1:2181,tidb2:2181,tidb3:2181</value>
  </property>
</configuration>
  • Configure hdfs-site.xml
vim /home/bigdata/hadoop/hadoop/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <!-- The name of the configured nameservice -->
  <property>
    <name>dfs.nameservices</name>
    <value>cluster</value>
  </property>
  <!-- Disable permission checking -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <!-- The namenodes under the nameservice "cluster" -->
  <property>
    <name>dfs.ha.namenodes.cluster</name>
    <value>nn1,nn2</value>
  </property>
  <!-- RPC and HTTP addresses and ports of the namenodes -->
  <property>
    <name>dfs.namenode.rpc-address.cluster.nn1</name>
    <value>tidb1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster.nn2</name>
    <value>tidb2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster.nn1</name>
    <value>tidb1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster.nn2</name>
    <value>tidb2:50070</value>
  </property>
  <!-- Shared storage where the journalnodes keep the namenode metadata (list of journalnodes) -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://tidb1:8485;tidb2:8485;tidb3:8485/cluster</value>
  </property>
  <!-- HA failover proxy provider used by clients after a failover -->
  <property>
    <name>dfs.client.failover.proxy.provider.cluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- SSH fencing configuration -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <!-- Path where the journalnode stores its files -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/media/data1/hdfs/hdfsjournal</value>
  </property>
  <!-- Enable automatic failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Namenode file path -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/media/data1/hdfs/name</value>
  </property>
  <!-- Datanode data storage paths -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/media/data1/hdfs/data,/media/data2/hdfs/data,/media/data3/hdfs/data</value>
  </property>
  <!-- Number of replicas -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Enable webhdfs -->
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.journalnode.http-address</name>
    <value>0.0.0.0:8480</value>
  </property>
  <property>
    <name>dfs.journalnode.rpc-address</name>
    <value>0.0.0.0:8485</value>
  </property>
  <!-- Zookeeper configuration -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>tidb1:2181,tidb2:2181,tidb3:2181</value>
  </property>
</configuration>
  • Modify mapred-site.xml
vim /home/bigdata/hadoop/hadoop/etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <!-- Use yarn as the mapreduce framework -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- Address of the mapreduce jobhistory server -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>tidb1:10020</value>
  </property>
  <!-- Web address of the task history server -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>tidb1:19888</value>
  </property>
</configuration>
  • Configure yarn-site.xml
vim /home/bigdata/hadoop/hadoop/etc/hadoop/yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Site specific YARN configuration properties -->

<configuration>
  <!-- HA id of this resourcemanager; change it on the other resourcemanager node and do not configure it on nodes that are not resourcemanagers -->
  <property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm1</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Cluster id shared by the two resourcemanagers -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>rmcluster</value>
  </property>
  <!-- Names of the resourcemanagers -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- Addresses of the resourcemanagers -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>tidb1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>tidb2</value>
  </property>
  <!-- Address of the zookeeper cluster -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>tidb1:2181,tidb2:2181,tidb3:2181</value>
  </property>
  <!-- Enable automatic recovery so a failed RM can recover tasks that were running (default is false) -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- Store resourcemanager state in the zookeeper cluster (the default is FileSystem) -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
</configuration>
  • workers: /home/bigdata/hadoop/hadoop/etc/hadoop/workers
# If NameNodes and DataNodes are separated, the NameNode hosts are not added to workers.
# They are not separated here, so all three hosts are added:
tidb1
tidb2
tidb3
  • start-dfs.sh / stop-dfs.sh
vim /home/bigdata/hadoop/hadoop/sbin/start-dfs.sh
vim /home/bigdata/hadoop/hadoop/sbin/stop-dfs.sh
Add to both files:
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root
  • start-yarn.sh / stop-yarn.sh
vim /home/bigdata/hadoop/hadoop/sbin/start-yarn.sh
vim /home/bigdata/hadoop/hadoop/sbin/stop-yarn.sh
Add to both files:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
  • Sync to the other machines via rsync
rsync -avup hadoop-3.1.1 root@tidb2:/home/bigdata/hadoop/
rsync -avup hadoop-3.1.1 root@tidb3:/home/bigdata/hadoop/
Note: after synchronization, pay attention to the HA id (yarn.resourcemanager.ha.id): change the id on the other NameNode/ResourceManager node and delete it on the DataNode-only nodes.

Startup

With all the above files ready, we are ready to launch.

ZooKeeper -> JournalNode -> format NameNode -> create the ZKFC namespace in ZooKeeper -> NameNode -> DataNode -> ResourceManager -> NodeManager
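For later restarts (after the one-time formatting steps below have been completed), the same order can be scripted. A rough sketch, assuming the paths used in this article, passwordless SSH, and that the environment variables are available to non-interactive shells:

# start zookeeper on all three machines
for h in tidb1 tidb2 tidb3; do
  ssh root@$h "/home/bigdata/zookeeper/zookeeper-3.4.13/bin/zkServer.sh start"
done
# start the HDFS daemons (namenodes, datanodes, journalnodes, zkfc)
# and the YARN daemons from the first node
/home/bigdata/hadoop/hadoop/sbin/start-all.sh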

Start ZooKeeper

On each server, switch to the ZooKeeper bin directory and run:
./zkServer.sh start

Start the journalnode

Go to the sbin directory under the Hadoop installation directory and start the journalnode:
./hadoop-daemon.sh start journalnode
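JournalNodes run on all three machines (see the allocation table above), so the same command has to be run on each node. A minimal sketch from tidb1, assuming the Hadoop path used earlier and that the JDK is on the PATH of non-interactive shells:

for h in tidb1 tidb2 tidb3; do
  ssh root@$h "/home/bigdata/hadoop/hadoop/sbin/hadoop-daemon.sh start journalnode"
done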

Format the namenode

  1. Format the NameNode
hadoop namenode -format
  2. Synchronize the formatted content to the other NameNode. This must be done, otherwise the other NameNode will not start.
The content to synchronize is the directory configured by dfs.namenode.name.dir in hdfs-site.xml:
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/media/data1/hdfs/name</value>
</property>
Synchronize it:
rsync -avup current root@tidb2:/media/data1/hdfs/name/
rsync -avup current root@tidb3:/media/data1/hdfs/name/

Formatting ZKFC

Note: this only needs to be executed on namenode1.

./hdfs zkfc -formatZK

Close the journalnode

./hadoop-daemon.sh stop journalnode

Starting a Hadoop Cluster

In the sbin directory under the Hadoop directory, run ./start-all.sh to start everything.

Viewing startup Status

  • Using commands
[root@tidb1 bin]# ./hdfs haadmin -getServiceState nn1
standby
[root@tidb1 bin]# ./hdfs haadmin -getServiceState nn2
active

  • View via the web interface
http://192.168.108.66:50070   # Note that this port is a custom port, not a default port
http://192.168.108.67:50070
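As a final smoke test, a small sketch that writes and reads a file through HDFS (run from the same bin directory as above; the /tmp/test path and the sample file are arbitrary choices for illustration):

./hdfs dfs -mkdir -p /tmp/test
./hdfs dfs -put /etc/hosts /tmp/test/
./hdfs dfs -ls /tmp/test
./hdfs dfs -cat /tmp/test/hosts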
