Install Hortonworks HDP Hadoop platform with Ambari server

Create the master node on Google Cloud with a high-memory, 8-core machine type and 52 GB of RAM.
SSH into the node.

sudo yum update
sudo yum install wget

Note the master host name using the command below. It will be needed later.

hostname -f

Log in as the root account

sudo su root

Set up passwordless SSH for the root account
Generate SSH keys.

ssh-keygen -t rsa
cd ~/.ssh

You will see 2 files under ~/.ssh: id_rsa (private key) and id_rsa.pub (public key). Copy id_rsa.pub (the public key) into the authorized_keys file on each host where you need to set up HDP.

First copy it on the master node itself.

cat id_rsa.pub >> authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys

Then copy it to the other nodes. The command below will prompt you for a password. Enter your Google ID password.

ssh-copy-id -i $HOME/.ssh/id_rsa.pub <your google id>@node-ip-or-hostname

Allow root login over SSH on each node. Log in and run the commands below on each node.

vi /etc/ssh/sshd_config

Edit the line below and set it to yes.

PermitRootLogin yes

Save the file and exit. Then restart the SSH service.

service sshd restart
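
If you would rather not edit the file by hand on every node, the same change can be scripted. The following is only a sketch: the function name set_permit_root_login is made up for this example, and it assumes a sed that accepts -E and -i with a backup suffix (true for the GNU sed shipped with CentOS). It keeps a .bak copy of the file next to the original.

```shell
# Hypothetical helper (not part of the original steps): force
# "PermitRootLogin yes" in the sshd config file given as $1.
set_permit_root_login() {
    cfg="$1"
    if grep -Eq '^[[:space:]]*#?[[:space:]]*PermitRootLogin[[:space:]]' "$cfg"; then
        # Rewrite the existing directive, whether or not it is commented out.
        sed -i.bak -E 's/^[[:space:]]*#?[[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin yes/' "$cfg"
    else
        # No directive found: append one at the end of the file.
        echo 'PermitRootLogin yes' >> "$cfg"
    fi
}
```

Run it as root with set_permit_root_login /etc/ssh/sshd_config, then restart sshd as shown above.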

Now try logging in to the master and the other nodes as root

ssh root@master-node-ip-or-hostname
ssh root@other-node-ip-or-hostname

It should log in without asking for a password.
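
With more than a couple of nodes, checking each login by hand gets tedious. Below is a minimal sketch of a loop that does the same check. The function name check_root_ssh is an assumption for this example. BatchMode=yes makes ssh fail instead of prompting, so a host that still wants a password (or whose host key has not been accepted yet) shows up as FAIL.

```shell
# Hypothetical helper: report which hosts accept key-based root login.
# Arguments: one or more host names or IPs.
check_root_ssh() {
    failed=0
    for host in "$@"; do
        # -n keeps ssh from reading stdin; "true" is a no-op remote command.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # Non-zero exit status if any host failed.
    return "$failed"
}
```

For example: check_root_ssh master-node-ip-or-hostname other-node-ip-or-hostname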

Copy the contents of the private key file “id_rsa” into Notepad or any text editor on your local computer and save it as a local file, let’s say hadoop.txt. This will be needed later during HDP setup.

To view the private key file's content:

cat ~/.ssh/id_rsa

Disable SELinux on each node. Edit the config file as shown below, then reboot each node.

vi /etc/selinux/config

Update the line below to disabled.

SELINUX=disabled

Reboot the nodes for the change to take effect.
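
The SELinux edit can also be done without opening an editor. This is a sketch only: disable_selinux_in_config is a made-up name, and it assumes the file already contains a line starting with SELINUX= (the SELINUXTYPE= line is not touched). It leaves a .bak copy and prints the resulting line so you can confirm the change before rebooting.

```shell
# Hypothetical helper: set SELINUX=disabled in the given config file
# (defaults to /etc/selinux/config) and print the resulting line.
disable_selinux_in_config() {
    cfg="${1:-/etc/selinux/config}"
    # Only the SELINUX= line matches; SELINUXTYPE= is left alone.
    sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
    grep '^SELINUX=' "$cfg"
}
```

Run it as root with no arguments on each node, then reboot.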

Set the umask for your current login session as below

umask 0022

Now we will install MySQL server on the master node. HDP will use it later as the repository database for various services. This step is optional. HDP can install the default Derby database during setup if you don’t want to use MySQL server.

Install MySQL server

Download and add the MySQL repository, then update.

sudo rpm -ivh mysql-community-release-el7-5.noarch.rpm
sudo yum update

Install MySQL as usual and start the service. During installation, you will be asked if you want to accept the results from the .rpm file’s GPG verification. If no error or mismatch occurs, enter y.

sudo yum install mysql-server
sudo systemctl start mysqld
sudo mysql_secure_installation

You will be given the choice to change the MySQL root password.

Test mysql

mysql -u root -p

At the mysql prompt, create databases for the Ambari, Hive and Oozie repositories. These databases will be needed later by the HDP setup.

mysql>create database ambari_repo_db;
mysql>create database hive_repo_db;
mysql>create database oozie_repo_db;

Allow all nodes to connect to MySQL. Execute the command below once for each node.

mysql>GRANT ALL ON *.* TO 'root'@'<master ip>' IDENTIFIED BY 'mysql_password';
mysql>GRANT ALL ON *.* TO 'root'@'<node ip>' IDENTIFIED BY 'mysql_password';
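
For a larger cluster, typing one GRANT per node is error-prone. As a rough sketch, the statements can be generated from a list of hosts. Both the function name print_grants and the MYSQL_APP_PASSWORD environment variable are assumptions made for this example, and the password must not contain a single quote.

```shell
# Hypothetical helper: print one GRANT statement per host argument.
# MYSQL_APP_PASSWORD is an assumed environment variable holding the
# password that the nodes will use to connect.
print_grants() {
    for host in "$@"; do
        printf "GRANT ALL ON *.* TO 'root'@'%s' IDENTIFIED BY '%s';\n" \
            "$host" "$MYSQL_APP_PASSWORD"
    done
}
```

Review the output first, then pipe it into the client, for example: print_grants <master ip> <node ip> | mysql -u root -p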

Install mysql connector on master node.

sudo yum install mysql-connector-java

Install Ambari

Download the Ambari repository file to a directory on your installation host.

sudo wget -nv <Ambari repository URL> -O /etc/yum.repos.d/ambari.repo

Confirm that the repository is configured by checking the repo list.

sudo yum repolist

You should see Ambari listed in the output of the above command.

Install the Ambari bits.

sudo yum install ambari-server

Run Ambari setup

sudo ambari-server setup
Customize user account for ambari-server daemon [y/n] (n)? n

It will display a warning about iptables and port accessibility. Select y to continue.
Select the JDK version to be installed. Use whichever is latest.

At the prompt “Enter advanced database configuration [y/n] (n)”, select y to use MySQL.

Enter advanced database configuration [y/n] (n)? y
Configuring database...
Choose one of the following options:
[1] - PostgreSQL (Embedded)
[2] - Oracle
[3] - MySQL / MariaDB
[4] - PostgreSQL
[5] - Microsoft SQL Server (Tech Preview)
[6] - SQL Anywhere
[7] - BDB
Enter choice (1): 3
Hostname (localhost):
Port (3306):
Database name (ambari): ambari_repo_db
Username (ambari): root
Enter Database Password (bigdata):
Re-enter password:
Configuring ambari database...

WARNING: Before starting Ambari Server, you must copy the MySQL JDBC driver JAR file to /usr/share/java and set property "server.jdbc.driver.path=[path/to/custom_jdbc
Press <enter> to continue.
Configuring ambari database...
Configuring remote database connection properties...
WARNING: Before starting Ambari Server, you must run the following DDL against the database to create the schema: /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CR
Proceed with configuring remote database connection properties [y/n] (y)? y
Extracting system views...
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.

Set up the MySQL connector for Ambari

sudo ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar

Go to the mysql prompt and create the repo tables.

mysql -u root -p
mysql>use ambari_repo_db;
mysql>source /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql;

Exit mysql prompt

Run the following command on the Ambari Server host:

sudo ambari-server start

To check the Ambari Server processes:

sudo ambari-server status

To log in to Ambari Web using a web browser:

http://<master node ip>:8080

default user name/password: admin/admin
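
Ambari Web may take a little while to come up after the server starts. Before opening the browser, you can poll the port from a shell. This is only a sketch: wait_for_ambari and the WAIT_SECONDS variable are names invented for this example, and it assumes curl is installed. It checks only that something answers on port 8080, not that the login works.

```shell
# Hypothetical helper: poll http://<host>:8080 until it answers.
# $1 = host name or IP, $2 = number of attempts (default 30).
# WAIT_SECONDS (assumed variable) is the pause between attempts, default 5.
wait_for_ambari() {
    host="$1"
    tries="${2:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # curl exits with 0 as soon as the server returns any HTTP response.
        if curl -s -o /dev/null "http://$host:8080"; then
            echo "Ambari Web is answering on http://$host:8080"
            return 0
        fi
        i=$((i + 1))
        sleep "${WAIT_SECONDS:-5}"
    done
    echo "Ambari Web did not answer after $tries attempts" >&2
    return 1
}
```

For example: wait_for_ambari <master node ip> 20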

From the Ambari Welcome page, choose Launch Install Wizard.

Give the cluster a name. This can be any name. Click Next.

Select the HDP version. Use the latest version. Click Next.

Give the hostname of each node. This should be the name from the output of the “hostname -f” command on each node.
On the same screen, select the SSH private key file that we created initially, i.e. hadoop.txt. This is associated with the root account. Then click “Register and confirm”.

On the next screen, the “Installation” progress will be displayed. Wait for it to finish and let it finish checking all hosts for potential problems. There may be some warnings, which can be ignored. Click Next.

On the next screen, choose the services that you want to install with HDP. Select only the services that you want to use. Installing all services may slow down the servers significantly or create resource problems.

On the next screen, assign master nodes to the various services. Use the high-capacity master node to host the various services. Use the other nodes mainly for HDFS-related services.

Click Next. On the next screen, select the slave nodes for hosting client services.

Click Next. The next screen shows the configuration of the various services. Notice that a few services have red warnings. We need to clear those one by one.

First click on the “Hive” service and select the existing MySQL database as the configuration database. On this screen you should use ‘hive_repo_db’, the database that we created earlier.

Similarly, go to the “Oozie” service and select the existing MySQL database ‘oozie_repo_db’ that we created earlier.

Go to the “Ambari Metrics” service and provide a Grafana admin password. This can be any password.

Go to the “SmartSense” service and click the “Activity Analysis” tab. Give a password for admin here. This can be any password.

At this point all red warnings should be cleared. If any remain, check the corresponding service. Click Next.

On the next screen, a summary of all services will be displayed. Click Deploy. It should start installing the various services on the nodes, which will take a while. Wait for the installation to finish, then click Next.

On the next screen, a summary of the installation will be displayed. Click Complete. It should take you to the Ambari Server home page.

Installation complete.