In this tutorial we will discuss how to use Spark as the execution engine for Hive. MapReduce is the default execution engine for Hive, but it is usually very slow. Spark is a faster engine for running queries on Hive.
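As a rough illustration of what switching engines looks like, the fragment below sets the engine for the current Hive session. It is a minimal sketch that assumes a Hive build with Hive-on-Spark support and a working Spark installation already visible to Hive. The table name `sample_table` is hypothetical and only serves to trigger a job.

```sql
-- Show which execution engine the session currently uses (mr by default).
SET hive.execution.engine;

-- Switch this session to Spark. Assumes Hive can locate a Spark installation.
SET hive.execution.engine=spark;

-- Hypothetical table: an aggregate query like this one would now run as a Spark job.
SELECT COUNT(*) FROM sample_table;
```

To make the change permanent rather than per session, the same `hive.execution.engine` property can be set in `hive-site.xml`.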
In this tutorial we will discuss how to install Spark on an Ubuntu VM. Spark does not have a particular dependency on Hadoop or other tools. But if you are planning to use Spark with Hadoop, you should follow my Part-1, Part-2 and Part-3 tutorials, which cover the installation of Hadoop and Hive. Install Java and…
In this guide we will discuss how to install Hadoop HDFS on a single-node cluster on a Google Cloud virtual machine. Follow the video tutorial below. To copy the various commands, you can come back to this page. Prepare a new server: create a new VM in Google Cloud with Ubuntu as the base image. Create an instance with…
In this video tutorial I will show you how to install Cloudera Hadoop version 5.14 on a Google Cloud virtual machine. The setup includes one master node and two slave nodes. Follow the steps in the video. Below are the initial commands that you need to start the Cloudera installation. Download the Cloudera Manager installer from the Cloudera site. Make the installer file as…