0/ Introduction

Setting up a Storm cluster involves three steps: set up the Java environment, set up the Zookeeper environment, and set up the Storm environment.

1/ Machine planning

Machine planning means deciding how many servers the cluster will contain, and which of them are master nodes and which are slave nodes. That is all machine planning is, as shown in the diagram below.
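As a reference, the sketch below records the three machines used later in this article as /etc/hosts entries. The hostname-to-IP mapping is an assumption pieced together from the host names and addresses that appear in the configuration files further down, so adjust it to your own planning:

# Assumed mapping between host names and IPs; run on every node as root
cat >> /etc/hosts <<'EOF'
192.168.209.161  liuyazhuang161   # master: Nimbus, Storm UI, Zookeeper server.1
192.168.209.162  liuyazhuang162   # slave:  Supervisor, Zookeeper server.2
192.168.209.163  liuyazhuang163   # slave:  Supervisor, Zookeeper server.3
EOF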

2/ Set up the Java environment

Since Storm stream processing runs as Java processes, the Java environment needs to be set up first.

<1> Download the JDK

You can download the Linux version of the JDK from Oracle's website: http://www.oracle.com/technetwork/java/javase/downloads/index.html. I downloaded jdk1.7.0_72.

<2> Upload to the primary node of the cluster, and unzip the JDK

Upload the package to the primary node with the rz command, then run tar -zxvf jdk-7u72-linux-x64.tar.gz on the command line to decompress it.

<3> Move the JDK directory to /usr/local

mv jdk1.7.0_72 /usr/local

<4> Configure environment variables

Enter vim /etc/profile on the command line to open the profile file and add the following lines to the end of the file:

JAVA_HOME=/usr/local/jdk1.7.0_72
CLASS_PATH=.:$JAVA_HOME/lib
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME CLASS_PATH PATH

Then run source /etc/profile so that the environment variables take effect immediately.
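To confirm the configuration took effect, a minimal check (assuming the JDK was installed to /usr/local/jdk1.7.0_72 as above):

source /etc/profile        # reload the environment variables
echo $JAVA_HOME            # should print /usr/local/jdk1.7.0_72
java -version              # should report java version "1.7.0_72"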

<5> Copy the file

On the command line, run the following commands to copy the JDK directory and the /etc/profile file to all the slave nodes:

scp -r /usr/local/jdk1.7.0_72 slave1@ip:/usr/local
scp -r /usr/local/jdk1.7.0_72 slave2@ip:/usr/local
scp /etc/profile slave1@ip:/etc/
scp /etc/profile slave2@ip:/etc/

Then run source /etc/profile on slave1 and slave2 respectively to make the environment variables take effect.

3/ Set up the Zookeeper cluster environment

<1> Download Zookeeper

Download Zookeeper from the Apache website: http://www.apache.org/dyn/closer.cgi/zookeeper/. I downloaded zookeeper-3.4.9.

<2> Upload it to the active node and decompress Zookeeper

Upload the package with the rz command, then run tar -zxvf zookeeper-3.4.9.tar.gz on the command line to decompress Zookeeper.

<3> Configure the Zookeeper cluster

In the conf directory of the Zookeeper installation, copy the sample configuration and open it for editing:

cp zoo_sample.cfg zoo.cfg
vi zoo.cfg

Configure it as follows:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/usr/local/zookeeper-3.4.9/data
dataLogDir=/usr/local/zookeeper-3.4.9/datalog
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=1
server.1=liuyazhuang161:2888:3888
server.2=liuyazhuang162:2888:3888
server.3=liuyazhuang163:2888:3888

Because dataDir and dataLogDir are set in the configuration above, these two directories need to be created: dataDir is where the myid file is stored, and dataLogDir is where the log files are stored. In the Zookeeper installation directory, create them:

mkdir data
mkdir datalog

Then switch into the data directory, create a file named myid, write 1 into it, and save it.
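For example, the myid file on this (master) node can be created like this (a sketch, assuming Zookeeper was unpacked to /usr/local/zookeeper-3.4.9):

echo 1 > /usr/local/zookeeper-3.4.9/data/myid    # this host is server.1 in zoo.cfg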

<4> Configure Zookeeper environment variables

For convenience, we also add Zookeeper to the environment variables. Together with the JDK configured earlier, our profile now looks like this:

JAVA_HOME=/usr/local/jdk1.7.0_72
CLASS_PATH=.:$JAVA_HOME/lib
ZOOKEEPER_HOME=/usr/local/zookeeper-3.4.9
PATH=$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH
export JAVA_HOME ZOOKEEPER_HOME CLASS_PATH PATH

<5> Copy the file

Copy the Zookeeper directory and the profile file to the liuyazhuang162 and liuyazhuang163 hosts by running the following commands:

scp -r /usr/local/zookeeper-3.4.9 slave1@ip:/usr/local
scp -r /usr/local/zookeeper-3.4.9 slave2@ip:/usr/local
scp /etc/profile slave1@ip:/etc/
scp /etc/profile slave2@ip:/etc/

Then run source /etc/profile on slave1 and slave2 respectively to make the environment variables take effect.

<6> Modify the myid files on the other hosts

Do not forget to change the content of the Zookeeper myid file on slave1 to 2, and the myid file on slave2 to 3.
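Before moving on to Storm, start Zookeeper on all three nodes and confirm the cluster has formed. A minimal sketch, assuming the Zookeeper bin directory is on the PATH as configured above:

# Run on each of the three nodes
zkServer.sh start
# Check the node's role: one node should report "leader" and the other two "follower"
zkServer.sh status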

4/ Set up the Storm cluster environment

<1> Download Storm

Download Storm from the Apache website: http://storm.apache.org/downloads.html. I downloaded apache-storm-1.1.1.tar.gz.

<2> Upload to the primary node, unpack Storm, and move to /usr/local

Upload the package to the primary node with the rz command, then run:

tar -zxvf apache-storm-1.1.1.tar.gz
mv apache-storm-1.1.1 /usr/local

<3> Modify storm.yaml

Edit conf/storm.yaml as follows:

storm.zookeeper.servers:
    - "192.168.209.161"
    - "192.168.209.162"
    - "192.168.209.163"
storm.zookeeper.port: 2181
storm.local.dir: "/usr/local/apache-storm-1.1.1/data"
nimbus.seeds: ["192.168.209.161"]
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703

storm.zookeeper.servers specifies the addresses of the Zookeeper cluster. If the Zookeeper cluster does not use the default port, you also need to configure storm.zookeeper.port. storm.local.dir configures the path where Storm stores a small amount of local state. nimbus.seeds specifies the address of the master (Nimbus) node. supervisor.slots.ports lists the ports available for worker processes on each supervisor.
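Before starting the Storm daemons, it can help to confirm that each Storm node can reach the Zookeeper servers listed above. A sketch using Zookeeper's "ruok" four-letter command, assuming nc (netcat) is installed:

# A healthy Zookeeper server answers "imok"
for zk in 192.168.209.161 192.168.209.162 192.168.209.163; do
    echo -n "$zk: "
    echo ruok | nc $zk 2181
    echo
done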

<4> Copy Storm to the slave nodes

scp -r /usr/local/apache-storm-1.1.1/ slave1@ip:/usr/local/
scp -r /usr/local/apache-storm-1.1.1/ slave2@ip:/usr/local/

<5> Configure environment variables and copy them

Run vim /etc/profile and update it as follows:

JAVA_HOME=/usr/local/jdk1.7.0_72
CLASS_PATH=.:$JAVA_HOME/lib
STORM_HOME=/usr/local/apache-storm-1.1.1
ZOOKEEPER_HOME=/usr/local/zookeeper-3.4.9
PATH=$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$STORM_HOME/bin:$PATH
export JAVA_HOME ZOOKEEPER_HOME STORM_HOME CLASS_PATH PATH

Send the environment variable file from the primary node to the slave nodes so that they stay consistent:

scp /etc/profile liuyazhuang162:/etc/
scp /etc/profile liuyazhuang163:/etc/

Then run source /etc/profile on the liuyazhuang162 and liuyazhuang163 hosts respectively to make the environment variables take effect.
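After sourcing the profile, a minimal check that Storm is now on the PATH on each node (the exact output format may differ by version):

source /etc/profile
which storm          # should resolve to /usr/local/apache-storm-1.1.1/bin/storm
storm version        # should report 1.1.1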

<6> Start the Storm services

On 192.168.209.161, start Nimbus:

bin/storm nimbus >/dev/null 2>&1 &

On 192.168.209.162 and 192.168.209.163, start the Supervisor:

bin/storm supervisor >/dev/null 2>&1 &

On 192.168.209.161, start the Storm UI:

bin/storm ui >/dev/null 2>&1 &

Note that the Storm UI must run on the Nimbus node, not on a Supervisor node.
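After the daemons are started, you can verify them with jps and by checking the log files. A sketch; the process names printed by jps and the exact log locations can vary slightly between Storm versions, so treat the paths below as assumptions based on the default storm.log.dir:

jps                                                          # the Storm daemons should be listed alongside the Zookeeper process
tail -f /usr/local/apache-storm-1.1.1/logs/nimbus.log        # on the Nimbus node
tail -f /usr/local/apache-storm-1.1.1/logs/supervisor.log    # on each Supervisor node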

5/ Access Storm UI

Below is the monitoring page showing the execution of Storm topologies.
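The UI listens on port 8080 by default (the ui.port setting), so on this cluster it should be reachable at http://192.168.209.161:8080. A quick command-line check:

curl -s http://192.168.209.161:8080 | head     # should print the beginning of the Storm UI HTML page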

6/ Problems and solutions

<1> Failed to Sync Supervisor

The Storm server version is 1.1.1, while the jar submitted to it is storm-starter-1.0.2.jar. This version mismatch causes the error above: the code in the jar differs too much from storm-core-1.0.2.jar, or the communication protocol has changed. Changing the Storm server version to 1.0.0 fixed the problem.
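When this error appears, it helps to compare the cluster version with the version the topology jar was built for before submitting. A quick sketch:

storm version                                                   # version of the running cluster, e.g. 1.1.1
ls -l /usr/local/apache-storm-1.1.1/storm-starter-1.0.2.jar     # the jar's target version (1.0.2) is part of its file name
# The two should line up; here the server was downgraded, as described above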

<2> There are multiple supervisors but only one is displayed on the Storm UI

The specific symptom is that multiple Supervisors are started but only one of them is displayed on the UI page (or some of them appear to be "merged"); when that one is killed, another one appears. For example, we have two Supervisors, 192.168.209.162 and 192.168.209.163, but only one of them is returned each time the supervisor list is requested through the UI, and the actual machine alternates. In our configuration, storm.local.dir is "/usr/local/apache-storm-1.1.1/data". The reason is that during deployment the software was distributed to the other machines directly with the scp command, which also copied the contents of local.dir, and Storm computes a Supervisor ID from one or more files in local.dir. After the local.dir contents are deleted, the ID is generated again.
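Following that explanation, here is a sketch of how to regenerate the Supervisor ID on each affected node, assuming storm.local.dir is /usr/local/apache-storm-1.1.1/data as configured above (this discards the node's local Storm state):

# 1. Stop the Supervisor process on this node (find its PID with jps and kill it)
# 2. Clear local.dir so that a new Supervisor ID is generated on the next start
rm -rf /usr/local/apache-storm-1.1.1/data/*
# 3. Start the Supervisor again
cd /usr/local/apache-storm-1.1.1 && bin/storm supervisor >/dev/null 2>&1 &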

<3> Could not find or load main class org.apache.storm.starter.ExclamationTopology

The topology was submitted with:

bin/storm jar /usr/local/apache-storm-1.1.1/storm-starter-1.0.2.jar org.apache.storm.starter.ExclamationTopology et

First, make sure the path to storm-starter-1.0.2.jar is correct; second, make sure the fully qualified class name, org.apache.storm.starter.ExclamationTopology (package name plus class name), is correct.
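Both points can be checked quickly before submitting (a sketch using the JDK's jar tool):

ls -l /usr/local/apache-storm-1.1.1/storm-starter-1.0.2.jar        # confirm the jar path exists
jar tf /usr/local/apache-storm-1.1.1/storm-starter-1.0.2.jar | grep ExclamationTopology
# the class should be listed as org/apache/storm/starter/ExclamationTopology.class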