1. Hadoop 2.6 Multi Node Cluster Setup Tutorial – Objective
In this tutorial, we will learn how to install and set up a Hadoop 2.6 multi-node cluster with YARN on Ubuntu. We will cover the recommended platform, the prerequisites for installing Hadoop on the master and the slaves, the software required, and how to start and stop the Hadoop cluster. The installation uses the Cloudera CDH5 distribution of Hadoop.
2. Hadoop 2.6 Multi Node Cluster Setup
Let us now go through the steps to set up a Hadoop multi-node cluster on Ubuntu, starting with the recommended platform.
2.1. Recommended Platform for Hadoop 2.6 Multi Node Cluster Setup
2.2. Install Hadoop on Master
Let us start by installing Hadoop on the master node in distributed mode.
I. Prerequisites for Hadoop 2.6 Multi Node Cluster Setup
Let us now start with learning the prerequisites to install Hadoop:
a. Add Entries in hosts file
Edit the hosts file (/etc/hosts) and add entries for the master and the slaves:
(NOTE: In place of MASTER-IP, SLAVE01-IP, SLAVE02-IP put the value of the corresponding IP)
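As a sketch of what these entries could look like, the snippet below writes three lines to a scratch file for review. The hostnames slave01 and slave02 match the ones used in the scp commands later in this tutorial; the hostname master and the scratch file path are assumptions. The IP placeholders must be replaced with real addresses before the lines are appended to /etc/hosts.

```shell
# Scratch file for review; the path is an assumption, not part of Hadoop.
HOSTS_SNIPPET="${HOSTS_SNIPPET:-${TMPDIR:-/tmp}/hadoop-hosts.snippet}"

# One "IP hostname" pair per line. Replace the placeholders with real IPs.
cat > "$HOSTS_SNIPPET" <<'EOF'
MASTER-IP    master
SLAVE01-IP   slave01
SLAVE02-IP   slave02
EOF

cat "$HOSTS_SNIPPET"
# Once the placeholders are replaced, append the lines on every node, e.g.:
#   sudo sh -c "cat '$HOSTS_SNIPPET' >> /etc/hosts"
```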
b. Install Java 8 (Recommended Oracle Java)
sudo apt-get install python-software-properties
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
c. Configure SSH
sudo apt-get install openssh-server openssh-client
ssh-keygen -t rsa -P ""
Copy the content of .ssh/id_rsa.pub (of master) to .ssh/authorized_keys (of all the slaves as well as master)
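One possible way to do the copy without editing files by hand is sketched below. The helper name authorize_key is hypothetical; it appends a public key to an authorized_keys file only if the key is not already present, so it can be run repeatedly. For the slaves, the ssh-copy-id utility from OpenSSH does the equivalent over the network.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# unless that exact line is already there.
authorize_key() {
    pubkey_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x matches whole lines, -F treats the key as a fixed string, not a regex.
    grep -qxF -e "$(cat "$pubkey_file")" "$auth_file" \
        || cat "$pubkey_file" >> "$auth_file"
    # sshd ignores authorized_keys files that other users can write to.
    chmod 600 "$auth_file"
}

# On the master, for the master itself:
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
# For each slave (hostnames as used later in this tutorial):
#   ssh-copy-id -i ~/.ssh/id_rsa.pub slave01
#   ssh-copy-id -i ~/.ssh/id_rsa.pub slave02
```

Afterwards, `ssh slave01` from the master should log in without asking for a password.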
II. Install Apache Hadoop in distributed mode
Let us now learn how to download and install Hadoop.
a. Download Hadoop
Below is the link to download Hadoop 2.x.
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.3.2.tar.gz
b. Untar Tarball
tar xzf hadoop-2.5.0-cdh5.3.2.tar.gz
(Note: All the required jars, scripts, configuration files, etc. are available in HADOOP_HOME directory (hadoop-2.5.0-cdh5.3.2))
III. Hadoop multi-node cluster setup Configuration
Let us now learn how to configure Hadoop for a multi-node cluster.
a. Edit .bashrc
Edit the .bashrc file located in the user's home directory and add the following environment variables:
(Note: After above step restart the Terminal/Putty so that all the environment variables will come into effect)
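A minimal sketch of the variables for .bashrc is shown below. It assumes the tarball was extracted in the user's home directory; the variable name HADOOP_HOME follows the way this tutorial refers to the installation directory, and the exact set of variables in the original listing may differ.

```shell
# Assumption: the tarball was extracted in the user's home directory.
export HADOOP_HOME="$HOME/hadoop-2.5.0-cdh5.3.2"

# Put the hadoop/hdfs/yarn commands (bin) and the start/stop scripts (sbin) on PATH.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

Running `. ~/.bashrc` in an open terminal applies the same variables without restarting it.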
b. Check environment variables
Check whether the environment variables added in the .bashrc file are available:
(The commands should not fail with the error: command not found)
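The check can be sketched as follows. The helper name check_commands is hypothetical; hdfs and yarn live in HADOOP_HOME/bin and start-dfs.sh in HADOOP_HOME/sbin, so all three are found only if both directories are on the PATH.

```shell
# Hypothetical helper: report whether each named command can be found on PATH.
# Returns non-zero if at least one command is missing.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# On a configured node one might run:
#   check_commands hdfs yarn start-dfs.sh
```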
c. Edit hadoop-env.sh
Edit configuration file hadoop-env.sh (located in HADOOP_HOME/etc/hadoop) and set JAVA_HOME:
export JAVA_HOME=<path-to-the-root-of-your-Java-installation> (eg: /usr/lib/jvm/java-8-oracle/)
d. Edit core-site.xml
Edit the configuration file core-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:
Note: /home/ubuntu/hdata is a sample location; please specify a location where you have Read Write privileges
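The entries themselves are not shown here, so the following is only a sketch of a typical core-site.xml. The property names fs.defaultFS and hadoop.tmp.dir are standard Hadoop settings; the hostname master and port 9000 are assumptions. By default the snippet writes to a scratch directory for review; set CONF_DIR to HADOOP_HOME/etc/hadoop to write the real file.

```shell
# Scratch directory by default; export CONF_DIR to target the real config directory.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- NameNode address; hostname "master" and port 9000 are assumptions -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <!-- Base directory for Hadoop's local data; sample location from the note above -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/ubuntu/hdata</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/core-site.xml"
```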
e. Edit hdfs-site.xml
Edit the configuration file hdfs-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:
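As an illustration, a minimal hdfs-site.xml might only set the block replication factor. dfs.replication is a standard HDFS property; the value 2 is an assumption chosen to match a cluster with two slaves. The snippet writes to a scratch directory unless CONF_DIR is set.

```shell
# Scratch directory by default; export CONF_DIR to target the real config directory.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Number of copies kept of each block; 2 is an assumption for two slaves -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/hdfs-site.xml"
```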
f. Edit mapred-site.xml
Edit the configuration file mapred-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:
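For a YARN-based cluster, the key setting in mapred-site.xml is the execution framework. The sketch below uses the standard property mapreduce.framework.name and writes to a scratch directory unless CONF_DIR is set. In some Hadoop 2.x tarballs the file ships only as mapred-site.xml.template, in which case it may need to be created first.

```shell
# Scratch directory by default; export CONF_DIR to target the real config directory.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local job runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/mapred-site.xml"
```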
g. Edit yarn-site.xml
Edit the configuration file yarn-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:
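A minimal yarn-site.xml could look like the sketch below. Both property names are standard YARN settings; the ResourceManager hostname master is an assumption, and the original listing may have set individual ResourceManager addresses instead. The snippet writes to a scratch directory unless CONF_DIR is set.

```shell
# Scratch directory by default; export CONF_DIR to target the real config directory.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Host running the ResourceManager; "master" is an assumption -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <!-- Shuffle service that NodeManagers provide to MapReduce jobs -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/yarn-site.xml"
```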
h. Edit slaves
Edit the configuration file slaves (located in HADOOP_HOME/etc/hadoop) and add the following entries:
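The slaves file lists one worker hostname per line. A sketch using the hostnames that appear in the scp commands of this tutorial, written to a scratch directory unless CONF_DIR is set:

```shell
# Scratch directory by default; export CONF_DIR to target the real config directory.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# One slave hostname per line; start-dfs.sh and start-yarn.sh read this list.
printf '%s\n' slave01 slave02 > "$CONF_DIR/slaves"

cat "$CONF_DIR/slaves"
```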
“Hadoop is set up on Master, now setup Hadoop on all the Slaves”
2.3. Install Hadoop On Slaves
I. Setup Prerequisites on all the slaves
Repeat the prerequisite steps from section 2.2 (I) on all the slaves, namely adding the entries to the hosts file and installing Java 8.
II. Copy configured setups from master to all the slaves
a. Create tarball of configured setup
tar czf hadoop.tar.gz hadoop-2.5.0-cdh5.3.2
(NOTE: Run this command on Master)
b. Copy the configured tarball on all the slaves
scp hadoop.tar.gz slave01:~
(NOTE: Run this command on Master)
scp hadoop.tar.gz slave02:~
(NOTE: Run this command on Master)
c. Un-tar configured Hadoop setup on all the slaves
tar xzf hadoop.tar.gz
(NOTE: Run this command on all the slaves)
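With more than a couple of slaves, steps b and c can be combined into a loop run from the master. The sketch below is an assumption about how one might script it, not part of the original procedure; the function and variable names are hypothetical. It defaults to a dry run that only prints the commands.

```shell
# Slave hostnames as used in this tutorial; override SLAVES for other clusters.
SLAVES="${SLAVES:-slave01 slave02}"
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Copy the tarball to each slave's home directory and unpack it there.
distribute_hadoop() {
    for host in $SLAVES; do
        run scp hadoop.tar.gz "$host:~"
        run ssh "$host" tar xzf hadoop.tar.gz
    done
}

distribute_hadoop
```

Running it with DRY_RUN=0 relies on the passwordless SSH configured in the prerequisites.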
“Hadoop is set up on all the Slaves. Now Start the Cluster”
2.4. Start the Hadoop Cluster
Let us now learn how to start the Hadoop cluster.
I. Format the name node
bin/hdfs namenode -format
(Note: Run this command on Master)
(NOTE: Format the name node only once, when you first install Hadoop. Formatting it again deletes all the data in HDFS)
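To guard against formatting by accident, one could check whether the name node directory already holds metadata before running the format command. The sketch below assumes hadoop.tmp.dir is /home/ubuntu/hdata and that dfs.namenode.name.dir is left at its default, which places the metadata under dfs/name inside that directory; the helper name needs_format is hypothetical.

```shell
# Hypothetical helper: succeed (return 0) if the name node directory
# has not been formatted yet, i.e. it has no current/VERSION file.
needs_format() {
    name_dir=$1
    [ ! -f "$name_dir/current/VERSION" ]
}

# Assumption: hadoop.tmp.dir is /home/ubuntu/hdata and the name directory
# is at its default location below it.
NAME_DIR="${NAME_DIR:-/home/ubuntu/hdata/dfs/name}"

if needs_format "$NAME_DIR"; then
    echo "not formatted yet: run bin/hdfs namenode -format"
else
    echo "already formatted: skipping format to protect existing HDFS data"
fi
```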
II. Start HDFS Services
sbin/start-dfs.sh
(Note: Run this command on Master)
III. Start YARN Services
sbin/start-yarn.sh
(Note: Run this command on Master)
IV. Check for Hadoop services
a. Check daemons on Master
b. Check daemons on Slaves
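The daemons can be listed with the jps tool that ships with the JDK. With the configuration used in this tutorial, the master is expected to run NameNode and ResourceManager (typically along with SecondaryNameNode), and each slave DataNode and NodeManager. The helper below, whose name check_daemons is hypothetical, is a sketch that reads jps output and reports any expected daemon that is missing.

```shell
# Hypothetical helper: read `jps` output on stdin and check that every
# expected daemon name appears. Returns non-zero if any is missing.
check_daemons() {
    jps_output=$(cat)
    status=0
    for daemon in "$@"; do
        # -w matches whole words, so NameNode does not match SecondaryNameNode.
        if printf '%s\n' "$jps_output" | grep -qw "$daemon"; then
            echo "running: $daemon"
        else
            echo "MISSING: $daemon"
            status=1
        fi
    done
    return "$status"
}

# On the master:  jps | check_daemons NameNode ResourceManager
# On a slave:     jps | check_daemons DataNode NodeManager
```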
2.5. Stop The Hadoop Cluster
Let us now see how to stop the Hadoop cluster.
I. Stop YARN Services
sbin/stop-yarn.sh
(Note: Run this command on Master)
II. Stop HDFS Services
sbin/stop-dfs.sh
(Note: Run this command on Master)
This completes the Hadoop 2.6 multi-node cluster setup on Ubuntu.