Install Hortonworks HDP Hadoop platform with Ambari server

March 25, 2018

In this tutorial, we will discuss how to install the Hortonworks Hadoop platform (HDP) with Ambari server. I am using a Google Cloud VM for this tutorial. You can do the same, or create your own VM using VirtualBox or AWS. You can also use your own physical server if you wish. The steps remain the same, except that some OS-level commands may change if you are not using CentOS.

Create the master node VM on Google Cloud with at least a highmem 8-core machine type and 52 GB RAM. I recommend a high CPU/memory instance for the master node, so that the many Ambari Java services can run smoothly in the background. Use CentOS 7 as the base image. You can also use another Linux distribution, but you will need to adjust the OS commands accordingly.

SSH into the master server, update the installed packages, and install wget so that we can download the Ambari setup files by URL.

sudo yum update
sudo yum install wget

Note the master host name using the command below. It will be needed later.

hostname -f

Log in as the root account.

sudo su root

Set up passwordless SSH for the root account. To do this, generate SSH keys.

ssh-keygen
cd ~/.ssh
ls

You will see two files under ~/.ssh: id_rsa (private key) and id_rsa.pub (public key). Append the public key (id_rsa.pub) to the authorized_keys file on each host where you need to set up HDP.

First, add it on the master node itself.

cat id_rsa.pub >> authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
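If you re-run the step above, the same key gets appended to authorized_keys again. Below is a minimal sketch of a repeatable version. The function name add_key is my own, not part of any tool, and it assumes the public key file holds a single line.

```shell
# add_key PUBKEY_FILE SSH_DIR   (add_key is a hypothetical helper name)
# Appends the public key to SSH_DIR/authorized_keys unless that exact line is
# already present, then applies the same permissions as the commands above.
add_key() {
    pubkey_file=$1
    ssh_dir=$2
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # -F: fixed string, -x: whole-line match, -q: no output, only exit status
    if ! grep -qxF -e "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}

# Example (run as root on the master node):
#   add_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh"
```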

Then copy it to the other nodes. The command below will prompt you for a password. Give your Google ID password.

ssh-copy-id -i $HOME/.ssh/id_rsa.pub username@node-ip-or-hostname

Allow root login over SSH on each node. Log in and issue the commands below on each node.

sudo vi /etc/ssh/sshd_config

In that file, edit the line below and set it to yes.

PermitRootLogin yes

Save the file and exit. Then restart the SSH service.

service sshd restart
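With several nodes, editing sshd_config by hand in vi gets tedious. The sketch below makes the same change with sed. The function name set_permit_root_login is my own. It assumes the file has no Match blocks, since a line appended at the end would land inside the last one, and it keeps a .bak copy of the file when it rewrites an existing line.

```shell
# set_permit_root_login FILE   (hypothetical helper name)
# Rewrites an existing PermitRootLogin line (commented out or not) to
# "PermitRootLogin yes", or appends the line if the file has none.
set_permit_root_login() {
    conf=$1
    if grep -Eq '^[[:space:]]*#?[[:space:]]*PermitRootLogin[[:space:]]' "$conf"; then
        sed -i.bak -E \
            's/^[[:space:]]*#?[[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin yes/' \
            "$conf"
    else
        printf 'PermitRootLogin yes\n' >> "$conf"
    fi
}

# Example (as root on each node), followed by a syntax check and the restart:
#   set_permit_root_login /etc/ssh/sshd_config
#   sshd -t && service sshd restart
```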

Now try logging in to all servers/nodes as root from the master node.

ssh root@master-node-ip-or-hostname
ssh root@other-node-ip-or-hostname

It should log in without asking for a password. It may prompt you to accept the host key fingerprint. Answer yes.
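Once the host keys have been accepted, the same check can be scripted. This is a rough sketch, not part of the original steps. It relies on the ssh option BatchMode=yes, which makes ssh fail instead of prompting for a password. The SSH variable and the function name are my own additions, there only so the check can be tried without a real cluster.

```shell
# SSH is a hypothetical override; by default the real ssh client is used.
SSH=${SSH:-ssh}

# can_login_without_password HOST   (hypothetical helper name)
# With BatchMode=yes ssh never prompts, so exit status 0 means key-based
# login as root works. Assumes the host key was already accepted once.
can_login_without_password() {
    "$SSH" -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true
}

# Example, run from the master node:
#   for h in master-node-ip-or-hostname other-node-ip-or-hostname; do
#       can_login_without_password "$h" || echo "passwordless login failed: $h"
#   done
```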

Copy the contents of the private key file “id_rsa” to your local computer using Notepad or any text editor and save it as a local file, say hadoop.pk. This will be needed later during HDP setup.

To view the private key content, use the command below. Copy the content and paste it into the hadoop.pk file on your local computer.

cat ~/.ssh/id_rsa
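Copying a key through a text editor can drop the first or last line. Here is a small sanity check you could run on the saved file, assuming a shell is available on your local computer. The function name is my own, and it only looks at the header and footer lines, so it does not prove the key is valid.

```shell
# looks_like_private_key FILE   (hypothetical helper name)
# Succeeds if the first line is a "BEGIN ... PRIVATE KEY" header and the last
# non-empty line is an "END ... PRIVATE KEY" footer.
looks_like_private_key() {
    first=$(sed -n '1p' "$1" | tr -d '\r')
    last=$(grep -v '^[[:space:]]*$' "$1" | tail -n 1 | tr -d '\r')
    case $first in
        '-----BEGIN '*'PRIVATE KEY-----') ;;
        *) return 1 ;;
    esac
    case $last in
        '-----END '*'PRIVATE KEY-----') ;;
        *) return 1 ;;
    esac
}

# Example:
#   looks_like_private_key hadoop.pk || echo "hadoop.pk looks incomplete"
```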

Disable SELinux on each node. Edit the file below on each node, then restart the node.

vi /etc/selinux/config
#Update below line to disabled
SELINUX=disabled

Reboot nodes for it to take effect.
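The SELinux edit can also be made without opening vi. A minimal sketch, with a function name of my own choosing. It only touches the SELINUX= line, leaves SELINUXTYPE= alone, and keeps a .bak copy of the file.

```shell
# disable_selinux_in FILE   (hypothetical helper name)
# Sets SELINUX=disabled in the given config file. Comment lines and the
# SELINUXTYPE= line do not match the pattern and are left as they are.
disable_selinux_in() {
    sed -i.bak -E 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Example (as root on each node), then reboot:
#   disable_selinux_in /etc/selinux/config
# After the reboot, getenforce should print "Disabled".
```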

Set the umask for your current login session as below.

umask 0022
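To see what this umask does, the sketch below creates a file and a directory under it and prints their permission strings. The function name is my own, and it runs in a subshell so your session's umask is not changed. Note that the umask command above only applies to the current session. To make it permanent you would have to add it to a login script such as /etc/profile, which this tutorial does not cover.

```shell
# show_umask_effect   (hypothetical helper name)
# The ( ... ) body runs in a subshell, so the caller's umask is untouched.
show_umask_effect() (
    dir=$(mktemp -d)
    umask 0022
    touch "$dir/file"      # files start from 666, minus 022 gives 644
    mkdir "$dir/subdir"    # directories start from 777, minus 022 gives 755
    ls -ld "$dir/file" "$dir/subdir" | cut -c1-10
    rm -rf "$dir"
)

# show_umask_effect prints:
#   -rw-r--r--
#   drwxr-xr-x
```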

Now we will install MySQL server on the master node. HDP will use it later as the repository database for various services. This step is optional. HDP can install the default Derby database during setup if you don’t want to use MySQL server.

Install MySQL server

Download and add MySQL repository on master node, then update.

wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
sudo rpm -ivh mysql-community-release-el7-5.noarch.rpm
sudo yum update

Install MySQL as usual and start the service. During installation, you will be asked if you want to accept the results from the .rpm file’s GPG verification. If no error or mismatch occurs then press y.

sudo yum install mysql-server
sudo systemctl start mysqld
sudo mysql_secure_installation

You will be given the choice to change the MySQL root password. Enter a password.

Test MySQL.

mysql -u root -p

At the MySQL prompt, create a database each for the Ambari repository, the Hive repository and the Oozie repository. These databases will be needed later by the HDP setup.

mysql>create database ambari_repo_db;
mysql>create database hive_repo_db;
mysql>create database oozie_repo_db;

Allow MySQL connections from all nodes. Execute the command below once for each node.

mysql>GRANT ALL ON *.* to root@'master-node-ip-or-hostname' IDENTIFIED BY 'mysql_password';
mysql>GRANT ALL ON *.* to root@'other-node-ip-or-hostname' IDENTIFIED BY 'mysql_password';
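With more than a couple of nodes, typing one GRANT per host is error-prone. The sketch below prints the statements so you can review them before sending them to mysql. The function name and the example host names are my own, and it assumes the password contains no single quote.

```shell
# print_grants PASSWORD HOST...   (hypothetical helper name)
# Prints one GRANT statement per host for the MySQL root user, using the same
# GRANT ... IDENTIFIED BY form as the commands above.
print_grants() {
    password=$1
    shift
    for host in "$@"; do
        printf "GRANT ALL ON *.* TO 'root'@'%s' IDENTIFIED BY '%s';\n" \
            "$host" "$password"
    done
}

# Example: review the output first, then pipe it into mysql.
#   print_grants 'mysql_password' master.example.com node1.example.com
#   print_grants 'mysql_password' master.example.com node1.example.com | mysql -u root -p
```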

Quit the MySQL prompt. Install the MySQL connector on the master node.

sudo yum install mysql-connector-java

Install Ambari

Download the Ambari repository file to a directory on master node.

sudo wget -nv http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.6.0.0/ambari.repo -O /etc/yum.repos.d/ambari.repo

Confirm that the repository is configured by checking the repo list.

sudo yum repolist

You should see Ambari listed in the output of the above command.

Install the Ambari server by issuing below command.

sudo yum install ambari-server

Run Ambari setup using below command.

sudo ambari-server setup
Customize user account for ambari-server daemon [y/n] (n)? n


It will display a warning about iptables and port accessibility; select y to continue.
Select the JDK version to be installed. Use whichever is latest.

At the prompt "Enter advanced database configuration [y/n] (n)", select y to use MySQL.

Enter advanced database configuration [y/n] (n)? y
Configuring database...
==============================================================================
Choose one of the following options:
[1] - PostgreSQL (Embedded)
[2] - Oracle
[3] - MySQL / MariaDB
[4] - PostgreSQL
[5] - Microsoft SQL Server (Tech Preview)
[6] - SQL Anywhere
[7] - BDB
==============================================================================
Enter choice (1): 3
Hostname (localhost):
Port (3306):
Database name (ambari): ambari_repo_db
Username (ambari): root
Enter Database Password (bigdata):
Re-enter password:
Configuring ambari database...

WARNING: Before starting Ambari Server, you must copy the MySQL JDBC driver JAR file to /usr/share/java and set property "server.jdbc.driver.path=[path/to/custom_jdbc_driver]" in ambari.properties.
Press <enter> to continue.
Configuring ambari database...
Configuring remote database connection properties...
WARNING: Before starting Ambari Server, you must run the following DDL against the database to create the schema: /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql
Proceed with configuring remote database connection properties [y/n] (y)? y
Extracting system views...
ambari-admin-2.6.0.0.267.jar
...........
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.

Set up the MySQL connector for Ambari.

sudo ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar

Go to the MySQL prompt and create the repository tables.

mysql -u root -p
mysql>use ambari_repo_db;
mysql>source /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql;
mysql>quit;

Exit mysql prompt.

Run the following command on the Ambari Server host to start it.

sudo ambari-server start


To check the Ambari Server process, run:

sudo ambari-server status
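The server can take a little while to come up after the start command. Here is a small retry helper for waiting on it in a script. The name wait_until is my own, and the usage example assumes curl is installed on the master node.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARG...]   (hypothetical helper name)
# Re-runs COMMAND about once per second until it succeeds. Gives up and
# returns 1 after roughly TIMEOUT_SECONDS attempts.
wait_until() {
    remaining=$1
    shift
    until "$@"; do
        remaining=$((remaining - 1))
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Example: wait up to about two minutes for the web UI port to answer.
#   wait_until 120 curl -sf -o /dev/null http://localhost:8080/ \
#       || echo "Ambari did not come up; check sudo ambari-server status"
```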

Log in to the Ambari Web UI using the URL below.
http://<VM IP Address>:8080
Default user name/password: admin/admin

From the Ambari Welcome page, choose Launch Install Wizard.

Give the cluster a name. This can be any name. Click Next.

Select the HDP version. Use the latest version. Click Next.

Give the hostname of each node. This should be the name from the output of the "hostname -f" command on each node. On the same screen, select the SSH private key file that we created initially, i.e. hadoop.pk on your local computer. This key is associated with the root account. Then click "Register and confirm".
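Since this screen expects the names reported by "hostname -f", it can help to check each name before pasting it in. The sketch below is only a rough syntax check with a function name of my own. It does not test whether the name resolves, and a dotted IP address would also pass.

```shell
# is_fqdn NAME   (hypothetical helper name)
# Succeeds if NAME has at least two dot-separated labels made of letters,
# digits and hyphens, with no label starting or ending in a hyphen.
is_fqdn() {
    printf '%s\n' "$1" | grep -Eq \
        '^([A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$'
}

# Example, run on each node:
#   is_fqdn "$(hostname -f)" || echo "hostname -f did not return a full name"
```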

On the next screen, the "Installation" progress will be displayed. Wait for it to finish, and let it finish checking all hosts for potential problems. There may be some warnings, which can be ignored. Click Next.

On the next screen, choose the services that you want to install with HDP. Select only the services that you want to use. Don't install all services, as this may slow down the servers significantly or create resource problems.

On the next screen, assign master nodes to the various services. Use the high-capacity master node to host most services. Use the other nodes mainly for HDFS-related services.

Click Next. On the next screen, select the slave nodes for hosting client services.

Click Next. On the next screen, the configuration of the various services will be shown. Notice that a few services have red warnings. We need to clear those one by one.

First, click on the "Hive" service and select an existing MySQL database as the configuration database. On this screen, use 'hive_repo_db', the database that we created earlier.

Similarly, go to the "Oozie" service and select the existing MySQL database 'oozie_repo_db' that we created earlier.

Go to the "Ambari Metrics" service and provide a Grafana admin password. This can be any password.

Go to the "SmartSense" service and click the "Activity Analysis" tab. Give a password for admin here. This can be any password.

At this point all red warnings should be cleared. If there are more warnings, check the affected service. Click Next.

On the next screen, a summary of all services will be displayed. Click Deploy. It should start installing the services on the nodes. This will take a while. Wait for it to finish, then click Next.

On the next screen, it will display a summary of the installation. Click Complete. It should take you to the Ambari Server home page.

Installation is now complete.
