Raspberry Pi Hadoop Cluster
I am currently in the process of learning Hadoop architecture,
administration and the MapReduce programming model. I started reading about
Hadoop and took free online courses, but something was missing. I wanted to
try out what I had read or been told in that training. Some exercises in
the training used a VM as a single-node Hadoop server. That didn’t make
sense to me, so I set up more VMs and configured them into
a Hadoop cluster. Still, the experience was not very satisfying because it
lacked the touch of hardware. I wasn’t really running a cluster, just
a bunch of virtual machines connected together in a virtual network. So I
thought, “How about I build a cluster of cheap computers?” In fact, that’s what Hadoop
was designed for: to run on a cluster of commodity hardware. I googled
around for cheap cluster computers and found a few blogs and videos about people
who ran MPI and Hadoop on mini clusters of Raspberry Pi boards. So I decided
to build my own mini cluster of Raspberry Pis. After all, the best way to learn
new things is to get your hands dirty.
There were a couple of Raspberry
Pi boards lying around my desk at home that I had used in some experiments
before. But I needed more, so I ordered 3 more of these boards. The Raspberry Pi is
an S$50 single-board computer with an ARM7 1 GHz
dual-core CPU, 1 GB of RAM and a microSD slot for storage.
I also bought SD cards, USB power cables, an 8-port
network switch and a high-current USB power supply/charger. Luckily, I was able
to find all of them in the neighbourhood shops, except for the Raspberry
Pi boards, which I ordered online and received the next day.
I started thinking about how to clump these boards together
in a rack-like structure where it is easy to cable them up to a network switch.
I found a few ideas online about using stand-off bolts, nuts and acrylic
boards to stack them up. The problem was that I didn’t have these materials,
so I started walking around the house to come up with ideas and to look for
materials. The mounting holes in the Raspberry Pi are 2 mm wide, so I started
looking for bolts and screws of this size but didn’t find any. Then I went to
the laundry area, where I found a wire clothes hanger. The metal wire core
is around 2 mm in diameter, covered with PVC plastic insulation around 1 mm thick.
I grabbed it and decided to build a mini rack out of it. I formed a half-round
loop and literally stitched the boards together.
The network switch I bought runs on a 5 V, 600 mA power supply,
so the USB power supply can power it as well.
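As a rough check on the power arithmetic: 5 V at 600 mA works out to 3 W for
the switch. The sketch below adds up a 5 V power budget in Python. Only the
switch figure comes from my setup; the per-board current, the number of boards
and the supply rating are placeholder assumptions, so substitute the numbers
printed on your own hardware.

```python
# Rough 5 V power budget. Only the switch figure (600 mA) comes from the post;
# every other number below is a placeholder assumption for illustration.

SUPPLY_VOLTAGE = 5.0  # volts, standard USB supply voltage


def total_current_ma(loads):
    """Sum the current draw in mA of a list of (name, milliamps) loads."""
    return sum(milliamps for _name, milliamps in loads)


def power_watts(current_ma, voltage=SUPPLY_VOLTAGE):
    """Convert a current in mA at the given voltage into watts."""
    return voltage * current_ma / 1000.0


def fits_supply(loads, supply_ma):
    """Return True if the combined load stays within the supply rating."""
    return total_current_ma(loads) <= supply_ma


if __name__ == "__main__":
    loads = [("network switch", 600)]                   # from the post
    loads += [(f"pi-{i}", 1000) for i in range(1, 6)]   # assumed: 5 boards, 1 A each
    supply_ma = 8000                                    # assumed supply rating

    total = total_current_ma(loads)
    print(f"total draw: {total} mA ({power_watts(total):.1f} W)")
    print("within supply rating:", fits_supply(loads, supply_ma))
```

If the total comes out close to the supply rating, it is probably worth
leaving some headroom, since boards can draw more under load.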
I downloaded a Linux distro called Raspbian
Jessie, a lightweight variant of Debian Linux intended for headless
Raspberry Pi servers. It’s a stripped-down version of Raspbian Wheezy without
the UI. I then updated it with the latest libraries and installed the Java 8 JDK.
I followed the steps described in this blog, but
with some extra steps to configure a second network interface using a Wi-Fi dongle,
and voila: I now have a Hadoop cluster running.
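With a cluster up, a classic first job to try is word count. The sketch below
is not Hadoop code; it is a plain Python simulation of the three MapReduce
phases (map, shuffle, reduce) running on a single machine, just to show the
shape of the programming model. The function names are my own and are not part
of any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop on Raspberry Pi", "Spark on Raspberry Pi"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the map and reduce steps run on different nodes and the
framework handles the shuffle; the same mapper and reducer logic could, for
example, be adapted into scripts for Hadoop Streaming.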
I later bought a set of stackable acrylic cases to make the
rack more stable and better looking. I stripped the board out of the cheap network switch (bought online for $7 from China) and mounted it in one of the stackable acrylic cases so the whole stack looks uniform. Now I am hadooping with this little beast. I
later upgraded to Hadoop 2 and installed Apache Spark for future experiments and adventures.



