Storm cluster environment setup

1. Cluster planning

2. Preconditions

3. Cluster building

  1. Download and unpack
  2. Configure environment variables
  3. Configure the cluster
  4. Distribute the installation package

4. Start the cluster

  1. Start the ZooKeeper cluster
  2. Start the Storm cluster
  3. Check the cluster

5. High availability verification

1. Cluster planning

A 3-node Storm cluster is set up here: the Supervisor and LogViewer services are deployed on all three hosts. To ensure high availability, the active Nimbus service is deployed on hadoop001 and a standby Nimbus service on hadoop002. The Nimbus services are coordinated by the ZooKeeper cluster: if the active Nimbus becomes unavailable, the standby Nimbus takes over as the new active Nimbus.
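
The resulting per-host layout (ZooKeeper runs on all three hosts, as it is started on each server in section 4.1):

  • hadoop001: Nimbus (active), Supervisor, LogViewer, ZooKeeper
  • hadoop002: Nimbus (standby), Supervisor, LogViewer, ZooKeeper
  • hadoop003: Supervisor, LogViewer, ZooKeeper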

2. Preconditions

Storm requires Java 7+ and Python 2.6.6+ to run, so both need to be installed in advance. To ensure high availability, an external ZooKeeper cluster is used instead of the ZooKeeper instance bundled with Storm. Since these three pieces of software are prerequisites for multiple frameworks, their installation steps are documented separately:

  • JDK installation on Linux: blog.csdn.net/u011493462/…
  • Python installation on Linux: blog.csdn.net/u011493462/…
  • ZooKeeper standalone and cluster setup: blog.csdn.net/u011493462/…
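
Before moving on, it helps to sanity-check the prerequisites on each host; a minimal check could look like this:

# Verify the JDK (Storm requires Java 7+)
java -version
# Verify Python (Storm requires Python 2.6.6+)
python --version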

3. Cluster building

1. Download and unpack

Download the installation package and unpack it. Official download address: storm.apache.org/downloads.h…
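
If you prefer to download from the command line, something like the following works; the URL is an assumption based on the Apache archive layout for release 1.2.2, so verify it against the downloads page:

# Download Storm 1.2.2 (URL assumed from the Apache archive layout; verify before use)
wget https://archive.apache.org/dist/storm/apache-storm-1.2.2/apache-storm-1.2.2.tar.gz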

# Unpack
tar -zxvf apache-storm-1.2.2.tar.gz

2. Configure environment variables

# vim /etc/profile

Add environment variables:

export STORM_HOME=/usr/app/apache-storm-1.2.2
export PATH=$STORM_HOME/bin:$PATH

Make configured environment variables take effect:

# source /etc/profile
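
To confirm the variables took effect, a quick check (assuming the storm CLI is now on the PATH):

# STORM_HOME should point to the unpacked directory and the storm CLI should resolve
echo $STORM_HOME
storm version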

3. Configure the cluster

Edit ${STORM_HOME}/conf/storm.yaml:

# Host list of the Zookeeper cluster
storm.zookeeper.servers:
     - "hadoop001"
     - "hadoop002"
     - "hadoop003"

# Nimbus node list
nimbus.seeds: ["hadoop001", "hadoop002"]

# Nimbus and Supervisor need local disk space to store a small amount of state (JAR packages, configuration files, etc.)
storm.local.dir: "/home/storm"

# Port of workers process. Each worker process uses a port to receive messages
supervisor.slots.ports:
     - 6700
     - 6701
     - 6702
     - 6703

The supervisor.slots.ports parameter configures the ports on which worker processes receive messages. By default, four workers are started on each Supervisor node. If you only want to start two workers, configure only two ports here, as shown below.
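
For example, a Supervisor limited to two workers would list only two ports in storm.yaml:

# Limit each Supervisor to two worker processes
supervisor.slots.ports:
     - 6700
     - 6701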

4. Distribute the installation package

Distribute the Storm installation package to the other two servers. It is also recommended to configure the Storm environment variables on hadoop002 and hadoop003; a sketch of that step follows the scp commands below.

scp -r /usr/app/apache-storm-1.2.2/ root@hadoop002:/usr/app/
scp -r /usr/app/apache-storm-1.2.2/ root@hadoop003:/usr/app/
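
A sketch of the environment-variable step on the other two hosts, assuming the same /usr/app install path as on hadoop001:

# Run on hadoop002 and hadoop003
cat >> /etc/profile <<'EOF'
export STORM_HOME=/usr/app/apache-storm-1.2.2
export PATH=$STORM_HOME/bin:$PATH
EOF
source /etc/profile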

4. Start the cluster

4.1 Starting the ZooKeeper Cluster

Start the ZooKeeper service on all three servers:

 zkServer.sh start
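
Optionally, confirm each node's role; in a healthy three-node ensemble one node reports leader and the other two report follower:

# Check the ZooKeeper node role
zkServer.sh status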

4.2 Starting the Storm Cluster

Because several daemons need to be started, they are launched as background processes. Go to ${STORM_HOME}/bin and run the following commands.

hadoop001 & hadoop002:

# Start the Nimbus master service
nohup sh storm nimbus &
# Start the Supervisor service
nohup sh storm supervisor &
# Start the Storm UI service
nohup sh storm ui &
# Start the LogViewer log-viewing service
nohup sh storm logviewer &

hadoop003:

Only the Supervisor and LogViewer services need to be started on hadoop003:

# Start the Supervisor service
nohup sh storm supervisor &
# Start the LogViewer log-viewing service
nohup sh storm logviewer &
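
With nohup, launcher output goes to nohup.out in the current directory, while the daemons typically write their own logs under ${STORM_HOME}/logs. To keep the working directory clean, a variant like the following can be used for each daemon:

# Discard launcher output and rely on the daemon logs (sketch; apply to each daemon as needed)
nohup sh storm nimbus > /dev/null 2>&1 &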

4.3 Checking the Cluster

Use the jps command on each server to check that the expected processes are running.
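
As a rough guide (process names come from each daemon's main class and may differ slightly between Storm versions):

# hadoop001 / hadoop002
jps
# Expected: nimbus, Supervisor, core (the Storm UI), logviewer, QuorumPeerMain (ZooKeeper)

# hadoop003
jps
# Expected: Supervisor, logviewer, QuorumPeerMain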


Visit port 8080 on hadoop001 or hadoop002 to open the Storm UI. It shows two Nimbus nodes (one active, one standby) and three Supervisors, each with four slots, that is, four available worker processes. At this point the cluster has been set up successfully.

5. High availability verification

Use the kill command on hadoop001 to kill the Nimbus process (see the sketch below). The Nimbus on hadoop001 then shows as offline, and the Nimbus on hadoop002 becomes the new leader.
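
A sketch of the kill step on hadoop001 (the PID lookup shown is one possible approach; any equivalent works):

# Find the Nimbus PID and kill it to simulate a failure
kill -9 $(jps | grep -i nimbus | awk '{print $1}')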
