This article shows how to install Apache Oozie on a Hadoop 2.8 single-node cluster. Oozie is a workflow scheduler system for managing Apache Hadoop jobs. I assume you have followed the previous articles on setting up a Hadoop single-node cluster, or that you already have a Hadoop server running.
Apache Maven must be installed first. Refer to the installation instructions at the link below to install Maven.
http://hadooptutorials.info/2017/11/13/installing-apache-maven-on-ubuntu/
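Before moving on, it can help to confirm that the tools used in this article are actually on the PATH. The snippet below is a minimal sketch; require_cmds is a hypothetical helper name, not part of Oozie or Maven.

```shell
# Hypothetical helper: report every required command that is not on the PATH.
# Returns 0 if all commands are present, 1 if at least one is missing.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For this article the call would be something like: require_cmds java mvn wget tar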
Download the Oozie 4.3.0 source tarball from an Apache mirror and save it to any directory. At the time of writing, the latest version is 4.3.0. If a newer version is available on the Apache site, use that instead and adjust the version number in the commands below.
cd ~/Downloads
wget http://apache.mirrors.hoobly.com/oozie/4.3.0/oozie-4.3.0.tar.gz
tar -zxf oozie-4.3.0.tar.gz
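If you download a different version, the extracted directory name changes as well. The sketch below checks that a tarball is readable and prints its top-level directory, so the following cd command can be adapted. tarball_top_dir is a hypothetical helper name.

```shell
# Hypothetical helper: print the top-level directory of a gzip tarball.
# Returns 1 if the file cannot be read as a gzip-compressed tar archive.
tarball_top_dir() {
    archive=$1
    # First pass: make sure tar can list the archive at all.
    tar -tzf "$archive" >/dev/null 2>&1 || return 1
    # Second pass: take the first path component of the first entry.
    tar -tzf "$archive" | awk -F/ 'NR == 1 { print $1 }'
}
```

For example: tarball_top_dir ~/Downloads/oozie-4.3.0.tar.gz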
Compile Oozie to create the binary distribution.
cd oozie-4.3.0/bin
./mkdistro.sh -DskipTests
For more information on the various build options, see the link below:
https://oozie.apache.org/docs/4.3.0/DG_QuickStart.html#Building_Oozie
After a successful build, the Oozie binary distribution is available in the distro/target folder. In my case this is /root/Downloads/oozie-4.3.0/distro/target.
Extract the Oozie binary and move it to the /usr/local folder.
cd /root/Downloads/oozie-4.3.0/distro/target
tar -zxvf oozie-4.3.0-distro.tar.gz
cd oozie-4.3.0-distro
mkdir -p /usr/local/oozie
mv oozie-4.3.0 /usr/local/oozie
cd /usr/local/oozie/oozie-4.3.0
mkdir libext
Add the line below to your ~/.bashrc file. It sets an environment variable for the Oozie home directory.
export OOZIE_HOME=/usr/local/oozie/oozie-4.3.0
Reload environment variables.
source ~/.bashrc
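A quick sanity check after reloading the environment can catch a mistyped path early. This is only a sketch; check_oozie_home is a hypothetical helper, and it assumes bin/oozie-setup.sh (used later in this article) exists in a valid installation.

```shell
# Hypothetical helper: verify that OOZIE_HOME is set and points at an
# Oozie installation (detected by the presence of bin/oozie-setup.sh).
check_oozie_home() {
    if [ -z "${OOZIE_HOME:-}" ]; then
        echo "OOZIE_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$OOZIE_HOME/bin/oozie-setup.sh" ]; then
        echo "no executable bin/oozie-setup.sh under $OOZIE_HOME" >&2
        return 1
    fi
    echo "OOZIE_HOME looks valid: $OOZIE_HOME"
}
```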
Copy the jar files below from the Hadoop home directory to the $OOZIE_HOME/libext folder. This step is necessary to avoid the class-loading error described in the note that follows.
cp $HADOOP_HOME/share/hadoop/common/*.jar $OOZIE_HOME/libext/
cp $HADOOP_HOME/share/hadoop/common/lib/*.jar $OOZIE_HOME/libext/
cp $HADOOP_HOME/share/hadoop/mapreduce/*.jar $OOZIE_HOME/libext/
cp $HADOOP_HOME/share/hadoop/mapreduce/lib/*.jar $OOZIE_HOME/libext/
cp $HADOOP_HOME/share/hadoop/hdfs/*.jar $OOZIE_HOME/libext/
cp $HADOOP_HOME/share/hadoop/hdfs/lib/*.jar $OOZIE_HOME/libext/
cp $HADOOP_HOME/share/hadoop/yarn/*.jar $OOZIE_HOME/libext/
cp $HADOOP_HOME/share/hadoop/yarn/lib/*.jar $OOZIE_HOME/libext/
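The eight copy commands above follow one pattern, so they can also be written as a loop. The sketch below assumes the same share/hadoop layout as above; copy_hadoop_jars is a hypothetical function name.

```shell
# Hypothetical helper: copy the jars of the four Hadoop modules (and their
# lib/ subfolders) from a Hadoop home directory into a destination folder.
copy_hadoop_jars() {
    hadoop_home=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for module in common mapreduce hdfs yarn; do
        for dir in "$hadoop_home/share/hadoop/$module" \
                   "$hadoop_home/share/hadoop/$module/lib"; do
            for jar in "$dir"/*.jar; do
                # Skip the literal pattern when the glob matches nothing.
                [ -e "$jar" ] || continue
                cp "$jar" "$dest/" || return 1
            done
        done
    done
}
```

Usage would be: copy_hadoop_jars "$HADOOP_HOME" "$OOZIE_HOME/libext"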
Note: Without copying these jar files from the Hadoop home directory to the Oozie libext folder, I was getting the error below.
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/cli/ParseException
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.cli.ParseException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
After copying the Hadoop jars as described above, this error went away.
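The missing class in the stack trace, org.apache.commons.cli.ParseException, belongs to the Apache Commons CLI library. One way to confirm that the copy step brought it in is to look for a jar whose name starts with commons-cli. This is a sketch that assumes the jar follows the usual commons-cli-<version>.jar naming; jar_present is a hypothetical helper.

```shell
# Hypothetical helper: succeed if a folder contains at least one jar whose
# file name starts with the given prefix.
jar_present() {
    dir=$1
    prefix=$2
    for jar in "$dir"/"$prefix"*.jar; do
        if [ -e "$jar" ]; then
            return 0
        fi
    done
    return 1
}
```

For example: jar_present "$OOZIE_HOME/libext" commons-cli && echo "commons-cli jar found"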
Now create the WAR file needed to run the Oozie server.
./bin/oozie-setup.sh prepare-war
The WAR file should now be created successfully.
Add or update the lines below in $HADOOP_CONF_DIR/core-site.xml. This configuration allows root to act as the impersonating (proxy) user. If you run Oozie jobs as a different user, replace root with that user name. Restart Hadoop afterwards so that the change takes effect.
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
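If you run Oozie as a user other than root, only the user name inside the two property names changes. The sketch below prints the same two properties for an arbitrary user so they can be pasted into core-site.xml. proxyuser_properties is a hypothetical helper, and the wildcard values are the same permissive ones used above.

```shell
# Hypothetical helper: print the two Hadoop proxy-user properties for a user.
proxyuser_properties() {
    user=$1
    cat <<EOF
<property>
<name>hadoop.proxyuser.${user}.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.${user}.groups</name>
<value>*</value>
</property>
EOF
}
```

For example: proxyuser_properties oozie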
Start the Oozie server.
cd $OOZIE_HOME
bin/oozied.sh start
The Oozie server should start without any errors.
You can verify that the Oozie server is running with the command below:
bin/oozie admin -oozie http://localhost:11000/oozie -status
Alternatively, open the Oozie web console at http://localhost:11000/oozie in a browser.
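The server can take a little while to come up after oozied.sh start, so a single status check may fail even though nothing is wrong. The sketch below retries a command a few times before giving up. wait_until is a hypothetical helper, and using it with the status command assumes the oozie client exits with a non-zero code when it cannot reach the server.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between tries. Succeeds as soon as the command succeeds.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

For example, to try for up to a minute: wait_until 12 5 bin/oozie admin -oozie http://localhost:11000/oozie -status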
On accessing http://localhost:11000/oozie/, I got the error below:
HTTP Status 500 - java.lang.NullPointerException
type Exception report
message java.lang.NullPointerException
description The server encountered an internal error that prevented it from fulfilling this request.
exception
org.apache.jasper.JasperException: java.lang.NullPointerException
org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:542)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:370)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:321)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:267)
javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
org.apache.oozie.servlet.AuthFilter$2.doFilter(AuthFilter.java:171)
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:636)
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:588)
org.apache.oozie.servlet.AuthFilter.doFilter(AuthFilter.java:176)
org.apache.oozie.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)
root cause
java.lang.NullPointerException
org.apache.jsp.index_jsp._jspInit(index_jsp.java:25)
org.apache.jasper.runtime.HttpJspBase.init(HttpJspBase.java:52)
org.apache.jasper.servlet.JspServletWrapper.getServlet(JspServletWrapper.java:164)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:340)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:321)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:267)
javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
org.apache.oozie.servlet.AuthFilter$2.doFilter(AuthFilter.java:171)
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:636)
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:588)
org.apache.oozie.servlet.AuthFilter.doFilter(AuthFilter.java:176)
org.apache.oozie.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)
To solve this error, delete or rename the JAR file below in the WEB-INF/lib folder.
cd $OOZIE_HOME/oozie-server/webapps/oozie/WEB-INF/lib
mv jsp-api-2.1.jar jsp-api-2.1.xxjar
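Now restart the Oozie server. Return to the Oozie home directory first, because the commands below use relative paths.
cd $OOZIE_HOME
bin/oozied.sh stop
The stop command may report that it is unable to remove the PID file. In that case, remove the file manually and then start the server again: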
rm oozie-server/temp/*.pid
bin/oozied.sh start
Try accessing http://localhost:11000/oozie again. It should work this time.
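Removing every .pid file unconditionally is fine on a test box, but it would also remove the file of a server that is still running. A slightly more careful sketch is shown below; it only deletes PID files whose process is no longer alive. remove_stale_pids is a hypothetical helper, not part of Oozie.

```shell
# Hypothetical helper: delete *.pid files in a folder unless the process
# they name is still running. Note that kill -0 also fails for processes
# owned by another user, so run this as the user that started the server.
remove_stale_pids() {
    dir=$1
    for pidfile in "$dir"/*.pid; do
        [ -e "$pidfile" ] || continue
        pid=$(cat "$pidfile")
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "process $pid is still running, keeping $pidfile" >&2
        else
            rm -f "$pidfile"
        fi
    done
}
```

From the Oozie home directory the call would be: remove_stale_pids oozie-server/temp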
Once the Oozie server is up and running, it is time to run a few of the samples provided with Oozie.
Run Examples
To run the examples bundled with Oozie, first extract them.
cd $OOZIE_HOME
tar -zxvf oozie-examples.tar.gz
cd examples/apps/map-reduce
We will run the basic MapReduce sample. Edit the job properties file for the MapReduce job.
vi job.properties
Update the two lines below to match your Hadoop configuration.
nameNode=hdfs://localhost:54310
jobTracker=localhost:8032
Please note that for Hadoop 2.6 and above you should use jobTracker=localhost:8032, which is the default port of the YARN ResourceManager address. For older versions, use jobTracker=localhost:54310. If you changed the ResourceManager port in yarn-site.xml, use that value here.
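If you prefer not to edit job.properties by hand, the two values can be set from the shell. The sketch below replaces an existing key=value line or appends one if the key is missing. set_property is a hypothetical helper, and it assumes simple key=value lines without backslashes in the value.

```shell
# Hypothetical helper: set key=value in a properties file, replacing the
# existing line for that key or appending a new one at the end.
set_property() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        index($0, k "=") == 1 { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back in place so the original file permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example:
set_property job.properties nameNode hdfs://localhost:54310
set_property job.properties jobTracker localhost:8032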
Copy the examples to HDFS:
hadoop fs -copyFromLocal $OOZIE_HOME/examples hdfs://localhost:54310/user/root/
To run the example:
cd $OOZIE_HOME
bin/oozie job -oozie http://localhost:11000/oozie -config $OOZIE_HOME/examples/apps/map-reduce/job.properties -run
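When the submission succeeds, the client prints a line of the form "job: <job-id>". Capturing that id makes it easy to query the job afterwards. The snippet below is a small sketch; parse_job_id is a hypothetical helper name.

```shell
# Hypothetical helper: read the output of a job submission on standard
# input and print only the job id from the "job: <job-id>" line.
parse_job_id() {
    sed -n 's/^job: *//p'
}
```

A possible way to use it, assuming the same server URL as above:
job_id=$(bin/oozie job -oozie http://localhost:11000/oozie -config $OOZIE_HOME/examples/apps/map-reduce/job.properties -run | parse_job_id)
bin/oozie job -oozie http://localhost:11000/oozie -info "$job_id"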
Note: For me, running the map-reduce example produced the error below in the Oozie web console.
File /user/root/share/lib does not exist
To solve this, run the command below, which installs the Oozie sharelib in HDFS.
cd $OOZIE_HOME
bin/oozie-setup.sh sharelib create -fs hdfs://localhost:54310 -locallib oozie-sharelib-4.3.0.tar.gz
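Then edit oozie-site.xml in $OOZIE_HOME/conf and add or update the configuration as shown below. Adjust the Hadoop configuration path and the user name if yours differ.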
<property>
<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*=/usr/local/hadoop/etc/hadoop</value>
</property>
<property>
<name>oozie.service.ProxyUserService.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>oozie.service.ProxyUserService.proxyuser.root.groups</name>
<value>*</value>
</property>
Now restart the Oozie server.
bin/oozied.sh stop
The stop command may again report that it is unable to remove the PID file, in which case remove it manually.
rm oozie-server/temp/*.pid
bin/oozied.sh start
Now try running the example again:
cd $OOZIE_HOME
bin/oozie job -oozie http://localhost:11000/oozie -config $OOZIE_HOME/examples/apps/map-reduce/job.properties -run
This time the job should run successfully, and the Oozie web console should show it with the status RUNNING.
After following the steps, I am getting this exception:
javax.servlet.ServletException: java.lang.NoSuchMethodError: org.eclipse.jdt.internal.compiler.CompilationResult.getProblems()[Lorg/eclipse/jdt/core/compiler/IProblem;
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:273)
javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
org.apache.oozie.servlet.AuthFilter$2.doFilter(AuthFilter.java:171)
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:572)
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:542)
org.apache.oozie.servlet.AuthFilter.doFilter(AuthFilter.java:176)
org.apache.oozie.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)
This appears in the Oozie web UI (http://localhost:11000/oozie).