
Installation of Hadoop

Hadoop runs on the Linux kernel. If you want to install Hadoop on Windows, you need to install Cygwin, which provides a Linux-like environment on Windows. Here is the link to get Cygwin: https://cygwin.com/install.html
Hadoop can be installed as a multi-node cluster or a single-node cluster; you can choose either one.
In this post I cover the installation of a single-node cluster on a Linux machine.
Step 1:
Java is mandatory to run Hadoop, so first check whether Java is already installed on your machine:

$ java -version
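If you want to script this check, here is a small sketch; it probes the PATH first so it prints a notice instead of failing when Java is absent:

```shell
# Check for a java binary on PATH; java -version prints to stderr,
# so redirect it if you want to capture the version string
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
else
  echo "java not found - install it in the next step"
fi
```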

If Java is not installed on your Linux machine, follow the step below to install it.

Step 2:

$ sudo apt-get install oracle-java8-installer

Step 3:

Hadoop requires SSH access to manage its nodes, i.e. remote machines plus your local machine if Hadoop runs on it. For a single-node cluster, you need to configure SSH access to localhost for your user.

Generate a public key:

$ ssh-keygen -t rsa -P ""

Then enable SSH access to your local machine by appending the public key to the authorized keys:

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Step 4:

Hadoop is open source from the Apache Software Foundation; go to the Apache download site and download the latest Hadoop release that suits your machine.

Extract the downloaded tar file:

$ tar xvfz hadoop-1.2.1.tar.gz
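If you want to rehearse the extract step without the real download, here is a self-contained sketch that builds a stand-in tarball and extracts it the same way (the paths under /tmp are arbitrary scratch locations):

```shell
# Build a stand-in hadoop-1.2.1 tarball, then extract it as in the step above
rm -rf /tmp/hadoop-tar-demo
mkdir -p /tmp/hadoop-tar-demo/src/hadoop-1.2.1/conf
tar -C /tmp/hadoop-tar-demo/src -czf /tmp/hadoop-tar-demo/hadoop-1.2.1.tar.gz hadoop-1.2.1
mkdir -p /tmp/hadoop-tar-demo/dest
tar -C /tmp/hadoop-tar-demo/dest -xzf /tmp/hadoop-tar-demo/hadoop-1.2.1.tar.gz
ls /tmp/hadoop-tar-demo/dest
```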

After extracting it, make the following changes to the configuration files under hadoop/conf.

Change 1:

core-site.xml:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>TEMPORARY-DIR-FOR-HADOOPDATASTORE</value>
    <description>A base for other temporary directories</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>

Change 2:

mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>
</configuration>

Change 3:

hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Step 5:

In both conf/slaves and conf/masters, change the contents to localhost.
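Since both files just contain a host list, one host per line, the change can be scripted. The sketch below writes into a scratch directory rather than your real conf directory, so adjust the path for your installation:

```shell
# Write "localhost" into stand-ins for conf/slaves and conf/masters
mkdir -p /tmp/hadoop-conf-demo
echo "localhost" > /tmp/hadoop-conf-demo/slaves
echo "localhost" > /tmp/hadoop-conf-demo/masters
cat /tmp/hadoop-conf-demo/slaves /tmp/hadoop-conf-demo/masters
```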

Step 6:

Set up the environment variables for Hadoop and Java.

For a temporary setup (current shell session only), run the commands below:

$ export JAVA_HOME=/usr/lib/jvm/jdk1.8.0

$ export HADOOP_COMMON_HOME=/home/hadoop/hadoop-install/hadoop-1.2.1

For a permanent setup:

Open ~/.bashrc and append the lines below at the end of the file.

To open .bashrc:

$ gedit ~/.bashrc

Then add the following two lines:

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0

export HADOOP_COMMON_HOME=/home/hadoop/hadoop-install/hadoop-1.2.1

Once that is done, run the command below to apply the changes to the current shell:

$ source ~/.bashrc
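To confirm the variables took effect in the current shell, echo them back. The paths shown are the examples from this post; yours may differ:

```shell
# Set the two variables (as ~/.bashrc would) and echo them to confirm they resolve
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0
export HADOOP_COMMON_HOME=/home/hadoop/hadoop-install/hadoop-1.2.1
echo "JAVA_HOME=$JAVA_HOME"
echo "HADOOP_COMMON_HOME=$HADOOP_COMMON_HOME"
```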

Step 7:

Format the Hadoop file system from inside the Hadoop directory:

$ ./bin/hadoop namenode -format

Step 8:

Start the cluster. This launches the HDFS and MapReduce daemons (NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker); you can check that they are running with the jps command.

$ ./bin/start-all.sh

Step 9:

To stop the cluster:

$ ./bin/stop-all.sh

About GSK

Hi, I am Santosh Gadagamma, a tutor in software engineering and an enthusiast for sharing knowledge in computer science and other domains. I developed this site to share knowledge with all aspirants of technologies like Java, C/C++, DBMS/RDBMS, Bootstrap, Big Data, JavaScript, Android, Spring, Hibernate, and Struts, and all levels of software project design, development, deployment, and maintenance. As a programmer I believe that "the world now needs computers to function." I hope this site guides you as a learning tool towards greater heights. I believe that education has no end point, and I wish to learn more in the process of teaching you.
