1. Cluster planning

Set up a three-node HBase cluster, with a Region Server deployed on each of the three hosts. To ensure high availability, deploy the active Master service on hadoop001 and the standby Master service on hadoop002. The ZooKeeper cluster coordinates and manages the Master services: if the active Master becomes unavailable, the standby Master becomes the new active Master.

2. Prerequisites

HBase depends on Hadoop and the JDK (HBase 2.0+ requires JDK 1.8+). To ensure high availability, we use an external ZooKeeper cluster instead of the ZooKeeper service built into HBase. For setup steps, see:

  • JDK installation on Linux
  • ZooKeeper standalone and cluster environment setup
  • Hadoop cluster environment setup

3. Cluster building

3.1 Downloading and Decompressing the File

Download and unpack HBase. Here I use the CDH version of HBase, available from: archive.cloudera.com/cdh5/cdh/5/

# tar -zxvf hbase-1.2.0-cdh5.15.2.tar.gz

3.2 Configuring Environment Variables

# vim /etc/profile

Add environment variables:

export HBASE_HOME=/usr/app/hbase-1.2.0-cdh5.15.2
export PATH=$HBASE_HOME/bin:$PATH

Make the configured environment variables take effect immediately:

# source /etc/profile
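A quick check after sourcing the profile can catch a mistyped path early. The following is a minimal sketch; check_hbase_env is a hypothetical helper name, not part of HBase, and it only inspects the two variables set above.

```shell
# Hypothetical helper: confirm HBASE_HOME is set and its bin directory is on PATH.
check_hbase_env() {
    if [ -z "${HBASE_HOME:-}" ]; then
        echo "HBASE_HOME is not set"
        return 1
    fi
    case ":$PATH:" in
        *":$HBASE_HOME/bin:"*)
            echo "HBASE_HOME=$HBASE_HOME (bin is on PATH)"
            ;;
        *)
            echo "$HBASE_HOME/bin is not on PATH"
            return 1
            ;;
    esac
}
```

Run check_hbase_env in the same shell after `source /etc/profile`; a non-zero return status means one of the two export lines did not take effect.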

3.3 Cluster Configuration

Go to ${HBASE_HOME}/conf and modify the configuration:

1. hbase-env.sh

# Configure the JDK installation location
export JAVA_HOME=/usr/java/jdk1.8.0_201
# Do not use the built-in ZooKeeper service
export HBASE_MANAGES_ZK=false

2. hbase-site.xml

<configuration>
    <property>
        <!-- Run HBase in distributed cluster mode -->
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <!-- Storage location of HBase data in HDFS -->
        <name>hbase.rootdir</name>
        <value>hdfs://hadoop001:8020/hbase</value>
    </property>
    <property>
        <!-- Addresses of the ZooKeeper cluster -->
        <name>hbase.zookeeper.quorum</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
</configuration>

3. regionservers

hadoop001
hadoop002
hadoop003

4. backup-masters

hadoop002

The backup-masters file does not exist by default and must be created in ${HBASE_HOME}/conf. It specifies the standby Master nodes, one hostname per line, and may list more than one.
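Creating the file can be sketched as follows. write_backup_masters is a hypothetical helper, not an HBase command; it assumes the target directory already exists and simply writes each given hostname on its own line.

```shell
# Hypothetical helper: write one standby Master hostname per line
# into <conf_dir>/backup-masters, replacing any existing file.
write_backup_masters() {
    conf_dir=$1
    shift
    if [ "$#" -lt 1 ]; then
        echo "usage: write_backup_masters <conf_dir> <host>..." >&2
        return 1
    fi
    printf '%s\n' "$@" > "$conf_dir/backup-masters"
}
```

For the layout in this article, the call would be `write_backup_masters "$HBASE_HOME/conf" hadoop002`.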

3.4 Configuring the HDFS Client

This configuration is optional. If you change the HDFS client configuration on the Hadoop cluster, for example by setting the replication factor dfs.replication to 5, you must use one of the following methods to make HBase aware of the change; otherwise HBase will still create files with the default replication factor of 3:

  1. Add a pointer to your HADOOP_CONF_DIR to the HBASE_CLASSPATH environment variable in hbase-env.sh.
  2. Add a copy of hdfs-site.xml (or hadoop-site.xml) or, better, symlinks, under ${HBASE_HOME}/conf, or
  3. if only a small set of HDFS client configurations, add them to hbase-site.xml.

The above is quoted from the official documentation. Each method is explained below:

First: Add the location of the Hadoop configuration directory to the HBASE_CLASSPATH variable in hbase-env.sh, as shown in the following example:

export HBASE_CLASSPATH=/usr/app/hadoop-server-cdh5.15.2/etc/hadoop

Second: Copy Hadoop's hdfs-site.xml (or hadoop-site.xml) to ${HBASE_HOME}/conf, or create symbolic links to it there. With this method, it is recommended to copy or link both core-site.xml and hdfs-site.xml, as shown in the following example:

# Copy (run from the Hadoop configuration directory)
cp core-site.xml hdfs-site.xml /usr/app/hbase-1.2.0-cdh5.15.2/conf/
# Or use symbolic links (run from ${HBASE_HOME}/conf)
ln -s /usr/app/hadoop-server-cdh5.15.2/etc/hadoop/core-site.xml
ln -s /usr/app/hadoop-server-cdh5.15.2/etc/hadoop/hdfs-site.xml

Note: The hadoop-site.xml configuration file is now called core-site.xml.

Third: If only a few settings are changed, add them directly to hbase-site.xml.

3.5 Distributing the Installation Package

Distribute the HBase installation package to the other two servers. After distribution, it is recommended to configure the HBase environment variables on those two servers as well.

scp -r /usr/app/hbase-1.2.0-cdh5.15.2/ hadoop002:/usr/app/
scp -r /usr/app/hbase-1.2.0-cdh5.15.2/ hadoop003:/usr/app/
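With more hosts, the same distribution step can be written as a loop. This is a sketch only: distribute_hbase and the DRY_RUN variable are hypothetical names introduced here, and the destination /usr/app/ is taken from the commands above. By default the function just prints the scp commands so they can be reviewed first.

```shell
# Hypothetical helper: copy the HBase directory to each listed host.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
distribute_hbase() {
    src=$1
    shift
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp -r $src $host:/usr/app/"
        else
            scp -r "$src" "$host:/usr/app/"
        fi
    done
}
```

For example, `distribute_hbase /usr/app/hbase-1.2.0-cdh5.15.2/ hadoop002 hadoop003` prints the two commands; prefixing the call with `DRY_RUN=0` runs them.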

4. Start the cluster

4.1 Starting the ZooKeeper Cluster

Start the ZooKeeper service on each of the three servers.

zkServer.sh start
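Before moving on, it can help to confirm that each node has joined the ensemble. `zkServer.sh status` reports the node's role on a line beginning with `Mode:`; the sketch below extracts that value. zk_mode is a hypothetical helper name, and it assumes the status text is supplied on standard input.

```shell
# Hypothetical helper: read `zkServer.sh status` output on stdin
# and print only the role reported on the "Mode:" line.
zk_mode() {
    sed -n 's/^Mode: //p'
}
```

On each server, `zkServer.sh status 2>&1 | zk_mode` should print a role; in a healthy three-node cluster, one node reports leader and the other two report follower.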

4.2 Starting a Hadoop Cluster

# Start the HDFS service
start-dfs.sh
# Start the YARN service
start-yarn.sh

4.3 Starting an HBase Cluster

On hadoop001, go to ${HBASE_HOME}/bin and run the following command to start the HBase cluster. After it runs, the Master service starts on hadoop001, the standby Master service starts on hadoop002, and the Region Server service starts on every node listed in the regionservers file.

start-hbase.sh
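One way to confirm that the daemons came up is to look at the Java processes on each node with `jps`, where the Master appears as HMaster and the Region Server as HRegionServer. The following sketch checks a jps listing for the expected names; check_hbase_processes is a hypothetical helper, and it reads the listing from standard input.

```shell
# Hypothetical helper: read `jps` output on stdin and report whether
# each daemon name given as an argument appears in it.
check_hbase_processes() {
    jps_output=$(cat)
    rc=0
    for daemon in "$@"; do
        # jps prints "<pid> <name>"; compare the second column exactly.
        if printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qxF "$daemon"; then
            echo "$daemon: running"
        else
            echo "$daemon: missing"
            rc=1
        fi
    done
    return "$rc"
}
```

With the layout in this article, `jps | check_hbase_processes HMaster HRegionServer` would be run on hadoop001 and hadoop002, and `jps | check_hbase_processes HRegionServer` on hadoop003.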

4.4 Viewing Services

Open the HBase web UI. I installed the CDH build of HBase 1.2, whose web UI is on port 60010; if you installed HBase 2.0 or later, use port 16010. You can see that the Master is on hadoop001, the three Region Servers are on hadoop001, hadoop002, and hadoop003, and there is also a Backup Master on hadoop002.


The HBase Master on hadoop002 is in standby state:


For more articles in the big data series, see the GitHub open source project: Getting Started with Big Data.