Configuring single node Storm Cluster.

The following steps are for Ubuntu 12.04 LTS or 14.04 LTS. Storm currently has a lot of moving parts, and Ubuntu is the easiest platform to configure it on. I tried configuring it on CentOS and found it quite challenging, so I suggest you get Storm working on Ubuntu first before trying CentOS.

---------- Prerequisites ----------

  1. Make sure your Ubuntu package index is up to date. You can update it using: $ sudo apt-get update
  2. Install your favorite JDK. For example: $ sudo apt-get install openjdk-6-jdk

---------- Other required tools ----------

$ sudo apt-get install git -y

$ sudo apt-get install libtool -y

$ sudo apt-get install automake -y

$ sudo apt-get install uuid-dev -y

$ sudo apt-get install g++ -y

$ sudo apt-get install gcc-multilib -y
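Before moving on, it can be useful to confirm that the tools are actually on the PATH. The following is a small sketch, not part of the original steps: the function name missing_tools is hypothetical, and the command names in the usage comment are assumptions about what the packages above install.

```shell
# missing_tools CMD...: print each command name that is not found on the PATH.
# Prints nothing when every command is available.
missing_tools() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Example (assumed command names for the packages installed above):
#   missing_tools java git libtool automake g++
```

If the function prints any names, install the corresponding package before continuing.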

---------- ZooKeeper ----------

ZooKeeper provides a service for maintaining centralized information in a distributed environment using a small set of primitives and group services. Storm uses ZooKeeper primarily to coordinate state information such as task assignments, worker status, and topology metrics between Nimbus and the supervisors in a cluster.

1) Download ZooKeeper (the latest version at the time of writing is 3.4.6). You can download it from a browser or with wget:
$ wget http://www.eng.lsu.edu/mirrors/apache/zookeeper/stable/zookeeper-3.4.6.tar.gz

2) Extract the tarball:
$ tar -xvf zookeeper-3.4.6.tar.gz

3) Rename the extracted ZooKeeper directory:
$ mv zookeeper-3.4.6 zookeeper

4) Optionally:
a) Add ZOOKEEPER_HOME to .bashrc
b) Add ZOOKEEPER_HOME/bin to the PATH variable

5) Create a data directory in a location of your choice:
$ mkdir zookeeper-data/

6) Create a configuration file, say zoo.cfg, under the ZOOKEEPER_HOME/conf/ directory.

7) Add the tickTime, dataDir and clientPort properties to the zoo.cfg file.

8) Start the ZooKeeper server and verify that it comes up:
$ zkServer.sh start

---------- ZeroMQ ----------

Storm internally uses ZeroMQ. In the current version it has to be installed explicitly; the plan for future releases is to include this dependency as part of the Storm distribution.

1) Get ZeroMQ:
$ wget http://download.zeromq.org/zeromq-2.1.7.tar.gz

2) Untar the tarball:
$ tar -xvf zeromq-2.1.7.tar.gz

---------- Configuring ZeroMQ ----------

1) $ cd zeromq-2.1.7

2) $ ./configure

3) $ make

4) $ sudo make install

---------- Java Bindings for ZeroMQ ----------

1) Get the Java binding for ZeroMQ:
$ git clone https://github.com/nathanmarz/jzmq.git
This will create a folder named jzmq.

Configuring jzmq:

1) $ cd jzmq

2) $ sed -i 's/classdist_noinst.stamp/classnoinst.stamp/g' src/Makefile.am

3) $ ./autogen.sh

4) $ ./configure

5) $ make

6) $ sudo make install

---------- Configuring Storm ----------

1) Download the Storm binaries:
$ wget http://mirror.tcpdiag.net/apache/incubator/storm/apache-storm-0.9.1-incubating/apache-storm-0.9.1-incubating.tar.gz

2) Untar the tarball:
$ tar -xvf apache-storm-0.9.1-incubating.tar.gz

3) Rename the extracted directory to storm:
$ mv apache-storm-0.9.1-incubating storm

4) Optionally: add STORM_HOME to the .bashrc file and add STORM_HOME/bin to the PATH.

5) Add a data directory for Storm to store temporary data and topology jars. I am creating it under $STORM_HOME:
$ cd $STORM_HOME
$ mkdir data

6) Open the $STORM_HOME/conf/storm.yaml file for editing.

7) Edit the values of the various parameters in this file. This is very important: it is the major part of the Storm configuration.

---------- Start the daemons and verify that the installation is successful ----------

Note: If you have not added STORM_HOME/bin to the PATH, you will need to go to the STORM_HOME/bin directory and issue the commands from a terminal there.
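Before starting the daemons, it may help to compare your zoo.cfg and storm.yaml against a minimal example. The script below is only a sketch for a single-node setup where everything runs on localhost. All values in it are assumptions rather than the settings used in this post: the ports (2181, 6700, 6701, 8080), the tickTime, and the variable names OUT_DIR, ZK_DATA_DIR and STORM_DATA_DIR are chosen for illustration. It writes both files into a scratch directory so that nothing in your real installation is overwritten.

```shell
# Sketch only: write example zoo.cfg and storm.yaml files for a single-node setup.
# Every value here is an assumption for illustration; adjust before using.

# Output location; defaults to a scratch directory so real configs are untouched.
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
ZK_DATA_DIR="${ZK_DATA_DIR:-$OUT_DIR/zookeeper-data}"
STORM_DATA_DIR="${STORM_DATA_DIR:-$OUT_DIR/storm-data}"
mkdir -p "$ZK_DATA_DIR" "$STORM_DATA_DIR"

# ZooKeeper: the three properties named in the ZooKeeper steps.
cat > "$OUT_DIR/zoo.cfg" <<EOF
tickTime=2000
dataDir=$ZK_DATA_DIR
clientPort=2181
EOF

# Storm: ZooKeeper and Nimbus both on localhost, two worker slots, UI on 8080.
cat > "$OUT_DIR/storm.yaml" <<EOF
storm.zookeeper.servers:
    - "localhost"
nimbus.host: "localhost"
storm.local.dir: "$STORM_DATA_DIR"
supervisor.slots.ports:
    - 6700
    - 6701
ui.port: 8080
EOF

echo "Wrote $OUT_DIR/zoo.cfg and $OUT_DIR/storm.yaml"
```

After reviewing the generated files, you could copy them to ZOOKEEPER_HOME/conf/ and STORM_HOME/conf/ respectively, pointing dataDir and storm.local.dir at the data directories you created earlier.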

 

1) Start Nimbus :
$ storm nimbus


 

2) Start Supervisor (do not close the previous terminal; open another terminal window and type the following):
$ storm supervisor

3) Start UI (open a new terminal, change to the storm directory and start the UI; don't close the previous terminals):
$ storm ui

4) Check the UI (open http://<IP address>:<UI port> in a browser, using the UI port defined in the storm.yaml file)
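If you are unsure which URL to open, the port can be read straight out of storm.yaml. The helper below is a rough sketch: the name ui_url is hypothetical, it assumes the key is written on one line as `ui.port: <number>` with no quotes or trailing comment, and it falls back to 8080, which is Storm's default UI port when the key is absent.

```shell
# ui_url STORM_YAML [HOST]: print the Storm UI URL, reading ui.port from the given file.
# HOST defaults to localhost; the port defaults to 8080 when ui.port is not set.
ui_url() {
  port=$(awk -F': *' '$1 == "ui.port" { print $2; exit }' "$1")
  echo "http://${2:-localhost}:${port:-8080}"
}

# Example (assumes STORM_HOME is set as in the optional step earlier):
#   ui_url "$STORM_HOME/conf/storm.yaml"
#   curl -sI "$(ui_url "$STORM_HOME/conf/storm.yaml")" | head -n 1
```

The curl line in the comment is one way to check from the terminal whether the UI is answering.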

 

Troubleshooting Notes : 

The two most commonly faced issues are:

1) Exception in thread "main" java.lang.RuntimeException: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused

2) Nimbus starts, but stops after a few seconds.

The problem could be one of the following:

  1. Nimbus did not start correctly, or it stopped some time after starting. Check the Nimbus logs for errors.
  2. Nimbus or ZooKeeper is not correctly configured. Please check the storm.yaml file.
  3. The dataDir defined in zoo.cfg does not exist or has permission issues.
  4. The storm.local.dir defined in storm.yaml does not exist or has permission issues.
  5. There could be connectivity issues between the machines. Please check your network settings.
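The two directory checks in the list above can be scripted. This is a minimal sketch; the function name check_dir is hypothetical, and the paths in the usage comment assume the data directories were created as in the earlier steps.

```shell
# check_dir PATH: report whether PATH is an existing directory writable by the current user.
# Prints one status line and returns non-zero when there is a problem.
check_dir() {
  if [ ! -d "$1" ]; then
    echo "MISSING: $1"
    return 1
  elif [ ! -w "$1" ]; then
    echo "NOT WRITABLE: $1"
    return 1
  fi
  echo "OK: $1"
}

# Example (assumed locations; use the paths from your zoo.cfg and storm.yaml):
#   check_dir zookeeper-data
#   check_dir "$STORM_HOME/data"
```

Run it as the same user that starts ZooKeeper and the Storm daemons, since the permission check depends on who is running it.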

Happy STORMing!


6 thoughts on “Configuring single node Storm Cluster.”

  1. Vinay April 6, 2015 at 8:49 am

    Awesome configuration tutorial 🙂 However, I have one update on the configuration: while configuring jzmq I had to manually change the configuration file under /jzmq/ to add JAVA_HOME, and then it worked 🙂

  2. Ajay September 17, 2015 at 1:02 am

    Great tutorial.
    Also, I observed that the supervisor goes down after some time, and then I have to start it again manually. Have you faced the same issue?

    • arpanr September 17, 2015 at 1:06 am

      Thanks Ajay.

      I didn’t face the issue of supervisor going down. Did you check the logs? Is memory on the system enough?

  3. Ajay September 17, 2015 at 1:54 pm

    Thanks Arpan for your quick reply.

    But other people are facing this same problem as well.

    [
    http://www.tanzirmusabbir.com/2013/02/setup-storm-cluster-on-amazon-ec2.html

    Anonymous, September 16, 2014 at 9:42 PM
    I am also doing the same trial of setting up a clustered setup on EC2 with 2 supervisors, one nimbus and one zookeeper. But for me only one supervisor instance shows in the Storm UI at a time. Both supervisors are able to connect and communicate with zookeeper, but at any given time only one is shown in the UI. The UI keeps switching between the supervisors at random intervals. Need help.
    Thanks

    ]

    I am using AWS ec2 m3.large instance.

    Today I will try with other instance as well.

    Once again thanks for reply.

    Ajay

  4. Ankur October 14, 2015 at 4:25 pm

    Amazing tutorial, bro 🙂 I spent less than 4 hrs setting up a Storm cluster for the first time. Thank you so much 🙂

    • arpanr October 14, 2015 at 4:44 pm

      Happy for you :-).
      The whole purpose of this blog entry was to help beginners set up the cluster quickly 🙂
