In this tutorial, we will discuss how to use Spark as the execution engine for Hive. MapReduce is the default execution engine for Hive, but it is usually very slow. Spark is a faster engine for running Hive queries.
In this tutorial, we will discuss how to install Spark on an Ubuntu VM. Spark does not have any particular dependency on Hadoop or other tools. But if you are planning to use Spark with Hadoop, you should first follow my Part-1, Part-2 and Part-3 tutorials, which cover the installation of Hadoop and Hive. Install Java and… Read More »
In this part, we will discuss how to install Hive on top of Hadoop HDFS.
In this part, we will discuss how to add a new data node to an existing Hadoop setup. Follow the step-by-step guide in the video tutorial.
In this guide, we will discuss how to install Hadoop HDFS on a single-node cluster using a Google Cloud virtual machine. Follow the video tutorial below. To copy the commands, you can come back to this page. Prepare new server: Create a new VM in Google Cloud with Ubuntu as the base image. Create an instance with… Read More »
In this tutorial, we will discuss how to install the Hortonworks Hadoop platform with Ambari server. I am using a Google Cloud VM for this tutorial. You can do the same or create your own VM using either VirtualBox or AWS. You can also use your own physical server if you wish. Steps will remain the same except some… Read More »
In this video tutorial, I will show you how to install Cloudera Hadoop version 5.14 on Google Cloud virtual machines. The setup includes one master node and two slave nodes. Follow the steps in the video. Below are the initial commands that you need to start the Cloudera installation. Download the Cloudera Manager installer from the Cloudera site. Make installer file as… Read More »
This article explains how to install Apache Maven on Ubuntu. The same instructions can be followed on other Linux distributions as well.
This article will show you how to install Hue on a Hadoop cluster. It assumes that you have a working Hadoop cluster with Hive installed and working. If not, follow the articles on this site to install Hadoop and Hive first.
This post describes how to set up password-less SSH access on a Linux server for a particular user. First, log in to the Linux server with a username and password. Generate an SSH key for this user using the command below. The command will generate two files in the ~/.ssh/ directory: (1) id_rsa and (2) id_rsa.pub. ~/.ssh directory is for current logged… Read More »