In this tutorial, we will discuss how to install the Hortonworks Hadoop platform (HDP) with Ambari Server. I am using Google Cloud VMs for this tutorial. You can do the same, or create your own VMs using VirtualBox or AWS. You can also use your own physical servers if you wish. The steps remain the same, except that some OS-level commands may change if you are not using CentOS.
Create the master node VM on Google Cloud with at least a highmem 8-core CPU and 52 GB of RAM. I recommend a high CPU/memory instance for the master node, because this allows the many Ambari Java services to run smoothly in the background. Use CentOS 7 as the base image. You can also use another Linux distribution, but you will need to adjust the OS commands accordingly. SSH into the master server, update the repositories, and install wget so that we can download the Ambari setup files by URL.
sudo yum update
sudo yum install wget
Note the master host name using the hostname -f command. This will be needed later.
Login as root account
sudo su root
Set up passwordless SSH for the root account. To do this, generate SSH keys.
ssh-keygen
cd ~/.ssh
ls
You will see two files under ~/.ssh: id_rsa (the private key) and id_rsa.pub (the public key). Append the public key to the authorized_keys file on each host where you need to set up HDP.
First, add it on the master node itself.
cat id_rsa.pub >> authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
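Running the append command more than once adds the same key again each time. As a minimal sketch, assuming you would rather make this step repeatable, the hypothetical helper add_key_once below appends the key only when it is not already present. The function name and its two arguments are illustrative, not part of any SSH tooling.

```shell
# Append a public key to an authorized_keys file only if it is not already there.
# Both paths are arguments, so the function can be tried on scratch files first.
add_key_once() {
    pubkey_file=$1
    auth_file=$2
    touch "$auth_file"
    # -x: match whole lines, -F: fixed strings (no regex), -f: read the key from a file
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}

# Example (run as root on the master node):
#   add_key_once ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

Calling it twice with the same key leaves a single entry in the target file.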
Then copy it to the other nodes. The command below will prompt you for a password; enter your Google ID password.
ssh-copy-id -i $HOME/.ssh/id_rsa.pub @node-ip-or-hostname
Allow root login over SSH on each node. Log in to each node and run the commands below.
sudo vi /etc/ssh/sshd_config
In that file, find the PermitRootLogin line and set it to yes.
Save the file and exit. Then restart the SSH service.
service sshd restart
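If you have many nodes, editing the file by hand on each one gets tedious. The sketch below shows one way to script the change, assuming the directive being set is PermitRootLogin. The helper set_sshd_option is hypothetical; it takes the file path as an argument so you can try it on a copy of sshd_config before touching the real one.

```shell
# Set "key value" in an sshd_config-style file.
# An existing line for the key (commented out or not) is rewritten;
# if there is none, the setting is appended at the end.
set_sshd_option() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp)
    if grep -Eq "^[#[:space:]]*${key}[[:space:]]" "$file"; then
        sed -E "s|^[#[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file" > "$tmp"
    else
        cat "$file" > "$tmp"
        printf '%s %s\n' "$key" "$value" >> "$tmp"
    fi
    # Write back through cat so the original file keeps its owner and mode
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (as root on each node):
#   set_sshd_option /etc/ssh/sshd_config PermitRootLogin yes
#   sshd -t && service sshd restart   # sshd -t checks the config before restarting
```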
Now try logging in to all nodes as root from the master node.
ssh root@master-node-ip-or-hostname
ssh root@other-node-ip-or-hostname
Each login should succeed without asking for a password. You may be prompted to accept the host key fingerprint; answer yes.
Copy the contents of the private key file “id_rsa” to your local computer using Notepad or any text editor, and save it as a local file, let’s say hadoop.pk. This will be needed later during HDP setup.
To view the private key content, you can use a command such as cat ~/.ssh/id_rsa. Copy the output and paste it into the hadoop.pk file on your local computer.
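Copying a key through a text editor can silently drop lines or add Windows line endings. As a rough sanity check, assuming the key is in the usual PEM-style text format, the hypothetical helper below verifies that the first and last lines of the saved file are intact BEGIN and END markers. It does not validate the key itself.

```shell
# Return success if the file starts and ends with PEM-style private key markers.
# Matches "PRIVATE KEY", "RSA PRIVATE KEY", "OPENSSH PRIVATE KEY", and similar.
looks_like_private_key() {
    head -n 1 "$1" | grep -Eq '^-----BEGIN ([A-Z]+ )?PRIVATE KEY-----$' &&
    tail -n 1 "$1" | grep -Eq '^-----END ([A-Z]+ )?PRIVATE KEY-----$'
}

# Example:
#   looks_like_private_key hadoop.pk && echo "key file looks complete"
```

A file saved with CRLF line endings or a trailing blank line fails this check, which is usually a sign that the paste needs to be redone.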
Disable SELinux on each node. Edit the configuration file as shown below on each node.
vi /etc/selinux/config
# Update the line below to disabled
SELINUX=disabled
Reboot the nodes for the change to take effect.
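Before rebooting, it can help to confirm that the edit was saved as intended. The sketch below reads the configured mode back from the file; selinux_configured_mode is a hypothetical helper name, and the path defaults to /etc/selinux/config.

```shell
# Print the value of the last uncommented SELINUX= line in the config file.
# SELINUXTYPE= lines are not matched because the pattern requires "SELINUX=".
selinux_configured_mode() {
    sed -n 's/^SELINUX=//p' "${1:-/etc/selinux/config}" | tail -n 1
}

# Example:
#   selinux_configured_mode      # should print: disabled
# After the reboot, getenforce reports the mode that is actually in effect.
```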
Set the umask for your current login session as below.
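The value to use is not shown here. As an assumption, based on the 022 umask that the Ambari installation documentation commonly recommends, the command would look like the sketch below; adjust the value if your environment requires something different.

```shell
# Assumed value: 0022 (files are created without group/other write permission)
umask 0022

# Print the current mask to confirm the change
umask
```

This only affects the current login session; a new session starts with the system default again.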
Now we will install MySQL server on the master node. HDP will later use it as the repository database for various services. This step is optional: if you don’t want to use MySQL, the setup can fall back to its default embedded databases (such as Derby).
Install MySQL server
Download and add the MySQL repository on the master node, then update.
wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
sudo rpm -ivh mysql-community-release-el7-5.noarch.rpm
sudo yum update
Install MySQL as usual and start the service. During installation, you will be asked whether to accept the result of the .rpm file’s GPG verification. If no error or mismatch is reported, press y.
sudo yum install mysql-server
sudo systemctl start mysqld
sudo mysql_secure_installation
You will be given the choice to change the MySQL root password. Enter a password, then log in to MySQL.
mysql -u root -p
At the MySQL prompt, create a database each for the Ambari, Hive and Oozie repositories. These databases will be needed later by the HDP setup.
mysql>create database ambari_repo_db;
mysql>create database hive_repo_db;
mysql>create database oozie_repo_db;
Allow MySQL connections from all nodes. Execute the command below once for each node.
mysql>GRANT ALL ON *.* to root@ IDENTIFIED BY 'mysql_password';
mysql>GRANT ALL ON *.* to root@ IDENTIFIED BY 'mysql_password';
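With more than a couple of nodes, typing one GRANT per host is error-prone. The sketch below generates the statements from a list of host names so they can be reviewed and then pasted or piped into the mysql client. The helper print_grants and the example host names are hypothetical; substitute the host names of your own nodes.

```shell
# Print one GRANT statement per host.
# Usage: print_grants PASSWORD HOST [HOST...]
print_grants() {
    password=$1
    shift
    for host in "$@"; do
        printf "GRANT ALL ON *.* TO 'root'@'%s' IDENTIFIED BY '%s';\n" "$host" "$password"
    done
}

# Example (host names are placeholders):
#   print_grants 'mysql_password' node1.example.com node2.example.com
#   print_grants 'mysql_password' node1.example.com | mysql -u root -p
```

Granting all privileges to root from remote hosts is convenient for a tutorial cluster; for anything longer-lived, a dedicated user with narrower privileges is the safer choice.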
Quit the MySQL prompt, then install the MySQL connector on the master node.
sudo yum install mysql-connector-java
Download the Ambari repository file to a directory on master node.
sudo wget -nv http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/22.214.171.124/ambari.repo -O /etc/yum.repos.d/ambari.repo
Confirm that the repository is configured by checking the repo list.
sudo yum repolist
You should see Ambari listed in the output of the above command.
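If you are scripting the installation, the same check can be done without reading the output by eye. A minimal sketch, assuming the repository shows up with "ambari" somewhere in its id or name; repo_listed is a hypothetical helper that reads a repo list on standard input.

```shell
# Succeed if the text on standard input mentions the given repository name.
repo_listed() {
    # -i: case-insensitive, -q: no output, only an exit status
    grep -qi -- "$1"
}

# Example:
#   sudo yum repolist | repo_listed ambari && echo "Ambari repository is configured"
```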
Install the Ambari server by issuing below command.
sudo yum install ambari-server
Run Ambari setup using below command.
sudo ambari-server setup
Customize user account for ambari-server daemon [y/n] (n)? n
It will display a warning about iptables and port accessibility; select y to continue.
Select the JDK version to be installed. Use whichever is the latest.
At the prompt "Enter advanced database configuration [y/n] (n)", select y to use MySQL.
Enter advanced database configuration [y/n] (n)? y
Configuring database...
==============================================================================
Choose one of the following options:
 - PostgreSQL (Embedded)
 - Oracle
 - MySQL / MariaDB
 - PostgreSQL
 - Microsoft SQL Server (Tech Preview)
 - SQL Anywhere
 - BDB
==============================================================================
Enter choice (1): 3
Hostname (localhost):
Port (3306):
Database name (ambari): ambari_repo_db
Username (ambari): root
Enter Database Password (bigdata):
Re-enter password:
Configuring ambari database...
WARNING: Before starting Ambari Server, you must copy the MySQL JDBC driver JAR file to /usr/share/java and set property "server.jdbc.driver.path=[path/to/custom_jdbc_driver]" in ambari.properties.
Press to continue.
Configuring ambari database...
Configuring remote database connection properties...
WARNING: Before starting Ambari Server, you must run the following DDL against the database to create the schema: /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql
Proceed with configuring remote database connection properties [y/n] (y)? y
Extracting system views...
ambari-admin-126.96.36.199.267.jar
...........
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.
Set up the MySQL connector for Ambari.
sudo ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
Go to the MySQL prompt and create the repository tables.
mysql -u root -p
mysql>use ambari_repo_db;
mysql>source /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql;
mysql>quit;
Exit mysql prompt.
Run the following command on the Ambari Server host to start it.
sudo ambari-server start
To check the Ambari Server processes:
sudo ambari-server status
Log in to the Ambari Web UI using the URL below:
http://<VM IP Address>:8080
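The web UI may not respond immediately after the server starts. As a sketch, assuming curl is installed on the machine you are checking from, the generic retry helper below polls a command until it succeeds or a number of attempts is used up. The name wait_until and its argument order are hypothetical.

```shell
# Run a command repeatedly until it succeeds.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt is still to come
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: try for up to 30 attempts, 5 seconds apart
#   wait_until 30 5 curl -sf -o /dev/null "http://<VM IP Address>:8080" \
#       && echo "Ambari Web UI is reachable"
```

If the check never succeeds from your local computer but works on the master node itself, the port is most likely blocked by a firewall rule rather than Ambari being down.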
Default user name/password: admin/admin
From the Ambari Welcome page, choose Launch Install Wizard.
Give the cluster a name. This can be any name. Click Next.
Select the HDP version. Use the latest version. Click Next.
Enter the host name of each node. This should be the name printed by the "hostname -f" command on that node. On the same screen, select the SSH private key file that we created initially, i.e. hadoop.pk on your local computer. This key is associated with the root account. Then click "Register and confirm".
On the next screen, the "Installation" progress will be displayed. Wait for it to finish, and let it finish checking all hosts for potential problems. There may be some warnings, which can be ignored. Click Next.
On the next screen, choose the services that you want to install with HDP. Select only the services that you plan to use. Installing every service may slow the servers down significantly or cause resource problems.
On the next screen, assign master nodes to the various services. Use the high-capacity master node to host most services, and use the other nodes mainly for HDFS-related services.
Click Next. On the next screen, select the slave nodes that will host client services.
Click Next. On the next screen, the configuration of the various services will be shown. Notice that a few services have red warnings. We need to clear those one by one.
First, click on the "Hive" service and select an existing MySQL database as the configuration database. On this screen, use 'hive_repo_db', the database that we created earlier.
Similarly, go to the "Oozie" service and select an existing MySQL database, using 'oozie_repo_db', the database that we created earlier.
Go to the "Ambari Metrics" service and provide a Grafana admin password. This can be any password.
Go to the "SmartSense" service and click the "Activity Analysis" tab. Give a password for admin here. This can be any password.
At this point all red warnings should be cleared. If any remain, check the affected service. Click Next.
On the next screen, a summary of all services will be displayed. Click Deploy. Ambari will start installing the services on the nodes, which takes a while. Wait for the installation to finish, then click Next.
The next screen displays a summary of the installation. Click Complete. It should take you to the Ambari Server home page.
Installation is now complete.