
Installation of Hadoop 3.x on Ubuntu on Single Node Cluster


1. Objective

In this tutorial on the installation of Hadoop 3.x on Ubuntu, we will set up a pseudo-distributed, single-node Hadoop 3.x cluster. The steps cover how to install Java, how to install SSH and configure passwordless SSH, how to download Hadoop, how to set up the Hadoop configuration (the .bashrc file, hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml), how to start the Hadoop cluster, and how to stop the Hadoop services.

2. Installation of Hadoop 3.x on Ubuntu

Before starting the Hadoop 3.x installation on Ubuntu, it helps to know the key features added in Hadoop 3 and how it compares with Hadoop 2.

2.1. Java 8 installation

Hadoop requires a working Java installation. Let us start with the steps for installing Java 8:

a. Install Python Software Properties

sudo apt-get install python-software-properties

b. Add Repository

sudo add-apt-repository ppa:webupd8team/java

c. Update the source list

sudo apt-get update

d. Install Java 8

sudo apt-get install oracle-java8-installer

e. Check that Java is correctly installed

java -version
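
The check above can also be scripted. The following is a minimal sketch, assuming a POSIX shell; `java_major_version` is a hypothetical helper written for this tutorial, not part of Java or Hadoop. It reduces the version string printed by `java -version` to a major version number, so a setup script could confirm that Java 8 is in use.

```shell
#!/bin/sh
# Hypothetical helper (not part of Java or Hadoop): reduce a Java version
# string to its major version, e.g. "1.8.0_292" -> 8 and "11.0.2" -> 11.
java_major_version() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;   # legacy "1.x" scheme used up to Java 8
        *)   echo "$1" | cut -d. -f1 ;;   # scheme used from Java 9 onwards
    esac
}

# "java -version" writes to stderr; the version is the quoted string on line 1.
if command -v java >/dev/null 2>&1; then
    version=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
    echo "Detected Java major version: $(java_major_version "$version")"
else
    echo "java not found on PATH"
fi
```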

2.2. Configure SSH

SSH is used for remote login. Hadoop uses SSH to manage its nodes, that is, the remote machines and also the local machine if Hadoop runs on it. Let us now set up SSH for Hadoop 3.x on Ubuntu:

a. Install SSH and pdsh

sudo apt-get install ssh
sudo apt-get install pdsh

b. Generate Key Pairs

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

c. Configure passwordless SSH

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

d. Change the permissions of the file that contains the key

chmod 0600 ~/.ssh/authorized_keys

e. Check SSH to localhost

ssh localhost
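
Running the `cat ... >>` step more than once appends the same key again. As a sketch of one way to make steps c and d repeatable, the hypothetical helper `authorize_key` below (a name introduced here for illustration) appends the public key only when that exact line is missing, and then restricts the file permissions. It assumes a POSIX shell and a single-line public key file.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to an authorized_keys file only
# if that exact line is not already present, then restrict its permissions.
authorize_key() {
    pub_file=$1
    auth_file=$2
    touch "$auth_file"
    # -x: whole-line match, -F: fixed string, -f: read the pattern from a file
    if ! grep -qxF -f "$pub_file" "$auth_file"; then
        cat "$pub_file" >> "$auth_file"
    fi
    chmod 0600 "$auth_file"
}

# Usage with the key pair generated in step b:
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
# A non-interactive check that the login needs no password:
#   ssh -o BatchMode=yes localhost true
```

With `BatchMode=yes`, ssh fails instead of prompting for a password, which makes the final check usable inside a script.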

2.3. Install Hadoop

a. Download Hadoop

wget http://redrockdigimark.com/apachemirror/hadoop/common/hadoop-3.0.0-alpha2/hadoop-3.0.0-alpha2.tar.gz

(This tutorial uses hadoop-3.0.0-alpha2.tar.gz; if you download a newer 3.x release, substitute its file name in the steps below.)

b. Untar Tarball

tar -xzf hadoop-3.0.0-alpha2.tar.gz

2.4. Hadoop Setup Configuration

a. Edit .bashrc

Open the .bashrc file, which is located in the user's home directory:

nano ~/.bashrc

Add the following parameters to it:

export HADOOP_PREFIX="/home/dataflair/hadoop-3.0.0-alpha2"
# Hadoop 3 deprecates HADOOP_PREFIX in favor of HADOOP_HOME, so set both
export HADOOP_HOME=${HADOOP_PREFIX}
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}

Then run

source ~/.bashrc
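
After reloading .bashrc it can be useful to confirm that the variables actually reached the current shell. The sketch below assumes a POSIX shell; `missing_hadoop_vars` is a hypothetical helper that lists whichever of the variables exported in this step are unset or empty.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every listed variable that is
# unset or empty, and return non-zero if any is missing.
missing_hadoop_vars() {
    status=0
    for name in HADOOP_PREFIX HADOOP_MAPRED_HOME HADOOP_COMMON_HOME \
                HADOOP_HDFS_HOME YARN_HOME; do
        # Indirect lookup: copy the value of the variable named by $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            status=1
        fi
    done
    return $status
}

if missing=$(missing_hadoop_vars); then
    echo "All Hadoop variables are set"
else
    echo "Not set:" $missing
fi
```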

b. Edit hadoop-env.sh

Edit the configuration file hadoop-env.sh (located in $HADOOP_HOME/etc/hadoop) and set JAVA_HOME:

export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
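
The JAVA_HOME path depends on which Java package is installed, so the value above may not match every machine. As a hedged sketch, the hypothetical helper `java_home_from_binary` below derives a candidate JAVA_HOME from the resolved location of the `java` binary. It assumes GNU `readlink -f` is available and that the binary sits under `bin/` or `jre/bin/` inside the JDK directory.

```shell
#!/bin/sh
# Hypothetical helper: turn the resolved path of the java binary into a
# JAVA_HOME candidate by stripping the trailing /bin/java and, for JDK 8
# layouts, the intermediate /jre directory.
java_home_from_binary() {
    home=${1%/bin/java}
    echo "${home%/jre}"
}

if command -v java >/dev/null 2>&1; then
    # readlink -f (GNU coreutils) follows the /usr/bin/java symlink chain.
    java_home_from_binary "$(readlink -f "$(command -v java)")"
fi
```

The printed directory is only a candidate; check that it contains `bin/java` before writing it into hadoop-env.sh.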

c. Edit core-site.xml

Edit the configuration file core-site.xml (located in $HADOOP_HOME/etc/hadoop) and add the following entries:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/dataflair/hdata</value>
  </property>
</configuration>
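
The hadoop.tmp.dir value above is tied to one user's home directory. As an illustration only, the hypothetical helper `render_core_site` below prints the same two properties with the NameNode URI and the temporary directory passed in as arguments, so the file can be generated for any user. It assumes a POSIX shell.

```shell
#!/bin/sh
# Hypothetical helper: print a core-site.xml body for a given NameNode URI
# (first argument) and temporary directory (second argument).
render_core_site() {
    cat <<EOF
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>$1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$2</value>
  </property>
</configuration>
EOF
}

# Print the configuration for the current user; redirect the output to
# core-site.xml once it looks right.
render_core_site "hdfs://localhost:9000" "$HOME/hdata"
```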

d. Edit hdfs-site.xml

Edit the configuration file hdfs-site.xml (located in $HADOOP_HOME/etc/hadoop) and add the following entries:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

e. Edit mapred-site.xml

If the mapred-site.xml file is not available, create it from the template:

cp mapred-site.xml.template mapred-site.xml

Edit the configuration file mapred-site.xml (located in $HADOOP_HOME/etc/hadoop) and add the following entries:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

f. Edit yarn-site.xml

Edit the configuration file yarn-site.xml (located in $HADOOP_HOME/etc/hadoop) and add the following entries:

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
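
A typo in a property name is easy to miss in these files. The sketch below shows one way to read a value back for a quick check. `get_property` is a hypothetical helper, and it assumes each tag sits on its own line as in the listings above; it is a line-based lookup, not a real XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop *-site.xml file (first argument: file, second argument: property).
get_property() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
            print; exit
        }
    ' "$1"
}

# Example: read fs.defaultFS back from core-site.xml when the file exists.
conf_dir="${HADOOP_HOME:-}/etc/hadoop"
if [ -f "$conf_dir/core-site.xml" ]; then
    get_property "$conf_dir/core-site.xml" fs.defaultFS
fi
```

Once Hadoop itself is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value that Hadoop actually resolved, which is the more reliable check.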


2.5. How to Start the Hadoop services

Let us now see how to start the Hadoop cluster:

The first step to starting up your Hadoop installation is formatting the Hadoop filesystem which is implemented on top of the local filesystem of your “cluster”. This is done as follows:

a. Format the namenode

bin/hdfs namenode -format

NOTE: Format the NameNode only once, when you first install Hadoop, and not each time you start the Hadoop filesystem; reformatting deletes all your data from HDFS.
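
One way to guard against an accidental second format is to test for the metadata that formatting creates. The sketch below is an illustration under assumptions: `namenode_is_formatted` is a hypothetical helper, and it assumes the default dfs.namenode.name.dir, which is the dfs/name directory under hadoop.tmp.dir and contains a current/VERSION file once formatted.

```shell
#!/bin/sh
# Hypothetical helper: report whether a NameNode has already been formatted.
# Assumes the default dfs.namenode.name.dir, i.e. <hadoop.tmp.dir>/dfs/name,
# where formatting creates a current/VERSION file.
namenode_is_formatted() {
    [ -f "$1/dfs/name/current/VERSION" ]
}

tmp_dir="/home/dataflair/hdata"   # the hadoop.tmp.dir value from core-site.xml
if namenode_is_formatted "$tmp_dir"; then
    echo "NameNode already formatted; skipping format"
elif command -v hdfs >/dev/null 2>&1; then
    hdfs namenode -format
else
    echo "hdfs not found on PATH"
fi
```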

b. Start HDFS Services

sbin/start-dfs.sh

If starting the HDFS services fails with a pdsh error, make ssh the default pdsh remote command and then start HDFS again:

echo "ssh" | sudo tee /etc/pdsh/rcmd_default

c. Start YARN Services

sbin/start-yarn.sh

d. Check how many daemons are running

Let us now see whether expected Hadoop processes are running or not:

jps
2961 ResourceManager
2482 DataNode
3077 NodeManager
2366 NameNode
2686 SecondaryNameNode
3199 Jps
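
Comparing the jps listing by eye works for one node, but the same check can be scripted. The following is a minimal sketch, assuming a POSIX shell with awk; `missing_daemons` is a hypothetical helper that reads jps output on standard input and prints each of the five daemons listed above that is not running.

```shell
#!/bin/sh
# Hypothetical helper: read "jps" output on stdin and print each expected
# Hadoop daemon that does not appear in it.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare against the whole second field, so that "NameNode" is not
        # satisfied by a "SecondaryNameNode" line.
        if ! echo "$jps_output" | awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
            echo "$daemon"
        fi
    done
}

if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons
fi
```

No output means all five daemons were found; any printed name points at the service whose log is worth checking first.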


2.6. How to Stop the Hadoop services

Let us learn how to stop Hadoop services now:

a. Stop YARN services

sbin/stop-yarn.sh

b. Stop HDFS services

sbin/stop-dfs.sh

Note:

Browse the web interface for the NameNode; by default, it is available at:

NameNode – http://localhost:9870/

Browse the web interface for the ResourceManager; by default, it is available at:

ResourceManager – http://localhost:8088/
