Storm cluster environment setup
1. Cluster planning
2. Preconditions
3. Cluster building
- Download and unpack
- Configure environment variables
- Configure the cluster
- Distribute the installation package
4. Start the cluster
- Start the ZooKeeper cluster
- Start Storm cluster
- Check the cluster
5. High availability verification
1. Cluster planning
A 3-node Storm cluster is set up here: the Supervisor and LogViewer services are deployed on all three hosts. To ensure high availability, the active Nimbus service is deployed on hadoop001 and a standby Nimbus service on hadoop002, with the Nimbus services coordinated and managed by the ZooKeeper cluster. If the active Nimbus becomes unavailable, the standby Nimbus takes over as the new active Nimbus.
2. Preconditions
Storm runs on Java 7+ and Python 2.6.6+, so both need to be installed in advance. To ensure high availability, an external ZooKeeper cluster is used here instead of the ZooKeeper instance built into Storm. Since these three pieces of software are dependencies of multiple frameworks, their installation steps are documented separately (a quick prerequisite check is sketched after the list):
- Linux environment JDK installation address: blog.csdn.net/u011493462/…
- Linux environment Python installation address: blog.csdn.net/u011493462/…
- Zookeeper standalone and cluster environment building address: blog.csdn.net/u011493462/…
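Before building the cluster, it may be worth confirming these prerequisites on each node. A minimal check (the exact version output depends on your installation):
# Verify the JDK and Python versions on every node
java -version     # should report 1.7 or later
python -V         # should report 2.6.6 or later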
3. Cluster building
1. Download and unpack it
Download the installation package and decompress it. The official download address: storm.apache.org/downloads.h…
# Unpack
tar -zxvf apache-storm-1.2.2.tar.gz
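The steps that follow assume the unpacked directory sits at /usr/app/apache-storm-1.2.2. If it was unpacked elsewhere, move it there first; a minimal sketch:
# Move the unpacked directory to the location assumed by later steps
mv apache-storm-1.2.2 /usr/app/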
2. Configure environment variables
# vim /etc/profile
Add environment variables:
export STORM_HOME=/usr/app/apache-storm-1.2.2
export PATH=$STORM_HOME/bin:$PATH
Make the configured environment variables take effect:
# source /etc/profile
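As a quick sanity check, the storm command should now be available from any directory:
# Should print the installed version, e.g. 1.2.2
storm version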
3. Configure the cluster
Edit ${STORM_HOME}/conf/storm.yaml as follows:
# Host list of the Zookeeper cluster
storm.zookeeper.servers:
- "hadoop001"
- "hadoop002"
- "hadoop003"
# Nimbus node list
nimbus.seeds: ["hadoop001", "hadoop002"]
# Nimbus and Supervisor need local disk to store a small amount of state (jar packages, configuration files, etc.)
storm.local.dir: "/home/storm"
# Ports for worker processes. Each worker process uses one port to receive messages
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
The supervisor.slots.ports parameter configures the ports on which worker processes receive messages. By default, four workers are started on each Supervisor node. If you only want to start two workers, configure only two ports here.
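For example, to limit each Supervisor to two workers, the configuration would keep only two ports (a sketch based on the defaults above):
supervisor.slots.ports:
    - 6700
    - 6701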
4. Distribute the installation package
Distribute the Storm installation package to the other two servers. It is also advisable to configure the Storm environment variables on those servers, as sketched below.
scp -r /usr/app/apache-storm-1.2.2/ root@hadoop002:/usr/app/
scp -r /usr/app/apache-storm-1.2.2/ root@hadoop003:/usr/app/
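To configure the Storm environment variables on hadoop002 and hadoop003, the same /etc/profile entries can be appended on each of those hosts; a minimal sketch:
# Run on hadoop002 and hadoop003
echo 'export STORM_HOME=/usr/app/apache-storm-1.2.2' >> /etc/profile
echo 'export PATH=$STORM_HOME/bin:$PATH' >> /etc/profile
source /etc/profile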
4. Start the cluster
4.1 Starting the ZooKeeper Cluster
Start the ZooKeeper service on all three servers:
zkServer.sh start
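Then check that the ZooKeeper cluster has formed correctly; one node should report itself as leader and the other two as followers:
# Run on each node
zkServer.sh status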
4.2 Starting the Storm Cluster
Because multiple processes need to be started, they are all launched as background processes. Go to the ${STORM_HOME}/bin directory and run the following commands.
hadoop001 & hadoop002:
# Start the Nimbus service (master)
nohup sh storm nimbus &
# Start the Supervisor service (worker)
nohup sh storm supervisor &
# Start the Storm UI service
nohup sh storm ui &
# Start the LogViewer log viewing service
nohup sh storm logviewer &
hadoop003:
Only the Supervisor and LogViewer services need to be started on hadoop003:
# Start the Supervisor service (worker)
nohup sh storm supervisor &
# Start the LogViewer log viewing service
nohup sh storm logviewer &
4.3 Viewing Clusters
Use the jps command to check the processes; the processes on the three servers should be as follows:
Access port 8080 on hadoop001 or hadoop002 to open the Storm UI. It shows two Nimbus nodes (one active, one standby) and three Supervisors, and each Supervisor has four slots, i.e. four available worker processes. At this point the cluster has been set up successfully.
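Optionally, a test topology can be submitted to confirm that the cluster accepts jobs. This is only a sketch, assuming the storm-starter examples bundled with the binary distribution (the jar path and version may differ in your installation):
# Submit the bundled WordCount example topology (adjust the jar path to your distribution)
storm jar ${STORM_HOME}/examples/storm-starter/storm-starter-topologies-1.2.2.jar org.apache.storm.starter.WordCountTopology word-count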
5. High availability verification
Use the kill command on hadoop001 to kill the Nimbus process. The Nimbus on hadoop001 then shows as offline, and the Nimbus on hadoop002 becomes the new leader.
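A minimal sketch of this step, assuming the Nimbus daemon appears as nimbus in the jps output:
# Run on hadoop001: find the Nimbus process id and kill it
kill -9 $(jps | grep -i nimbus | awk '{print $1}')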