NOTE: This post deals with only a minimal single-node cluster setup. Other posts will deal with various issues related to resource allocation on a multi-node cluster.
While setting up Hadoop 2.2.0 on Ubuntu 12.04.3 LTS 64-bit (VM on Hyper-V), I had to refer to multiple resources and had to overcome some roadblocks. The procedure that worked for me is shared here in three posts:
- This post describes software setup and configuration.
- Part 2 describes starting up processes and running an example.
- Part 3 describes building native libraries for the 64-bit system to give a noticeable performance boost. The downloaded distribution contains 32-bit binaries and the alternative Java libraries can’t match this performance.
Before performing the setup steps below, I had to ensure that SSH and JDK6 were installed.
sudo apt-get install ssh sudo apt-get install openjdk-6-jdk
Create Hadoop User
It’s recommended that all Hadoop-related work be performed while logged…
View original post 557 more words