Error while running Hive Action in Oozie

I figured out what was going wrong!

The class org/apache/hadoop/hive/cli/CliDriver is required to execute a Hive action, as the error message makes clear. This class lives in the jar file hive-cli-0.7.1-cdh3u5.jar (cdh3u5 is my Cloudera version).

Oozie looks for this jar in the ShareLib directory. The location of this directory is usually configured on the Oozie server in oozie-site.xml, under the property oozie.service.WorkflowAppService.system.libpath, so Oozie should find the jar easily.

But in my case the configuration did not include this property, so Oozie didn't know where to look for the jar, hence the java.lang.NoClassDefFoundError.

To resolve this, I added a parameter to my job.properties file that points Oozie to the ShareLib directory: oozie.libpath=${nameNode}/user/oozie/share/lib (the exact path depends on where the ShareLib directory is configured on your cluster).
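For reference, the relevant part of job.properties might look roughly like the sketch below. It reuses the nameNode and application path from the question; the ShareLib path is the one from my cluster and is an assumption for yours, and the commented-out oozie.use.system.libpath line is only an alternative to consider when the server-side system libpath is configured, not something I used here.

```properties
# Minimal sketch, not a complete job.properties; values are cluster-specific.
nameNode=hdfs://hdfs.bravo.hadoop.apollogrp.edu
oozie.wf.application.path=${nameNode}/user/${user.name}/hiveQuery

# Point Oozie at the ShareLib directory that contains the Hive jars
# (hive-cli-0.7.1-cdh3u5.jar in my case). Adjust the path to your cluster.
oozie.libpath=${nameNode}/user/oozie/share/lib

# Possible alternative (assumption: the server defines
# oozie.service.WorkflowAppService.system.libpath):
# oozie.use.system.libpath=true
```

Before resubmitting, it is worth confirming that the directory given in oozie.libpath actually exists on HDFS and contains the Hive jars.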

This got rid of the error!

Author by

Chaos

Interested in Cloud Computing, Big Data processing, Virtual Machines, and Operating Systems

Updated on June 04, 2022

Comments

  • Chaos
    Chaos almost 2 years

    I'm trying to run a hive action through Oozie. My workflow.xml is as follows:

    <workflow-app name='edu-apollogrp-dfe' xmlns="uri:oozie:workflow:0.1">
        <start to="HiveEvent"/>
        <action name="HiveEvent">
                <hive xmlns="uri:oozie:hive-action:0.2">
                        <job-tracker>${jobTracker}</job-tracker>
                        <name-node>${nameNode}</name-node>
                        <configuration>
                                <property>
                                        <name>oozie.hive.defaults</name>
                                        <value>${hiveConfigDefaultXml}</value>
                                </property>
                        </configuration>
                        <script>${hiveQuery}</script>
                        <param>OUTPUT=${StagingDir}</param>
                </hive>
    
                <ok to="end"/>
                <error to="end"/>
        </action>
    
        <kill name='kill'>
                        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name='end'/>
    </workflow-app>

    And here is my job.properties file:

    oozie.wf.application.path=${nameNode}/user/${user.name}/hiveQuery
    oozie.libpath=${nameNode}/user/${user.name}/hiveQuery/lib
    queueName=interactive
    
    #QA
    nameNode=hdfs://hdfs.bravo.hadoop.apollogrp.edu
    jobTracker=mapred.bravo.hadoop.apollogrp.edu:8021
    
    # Hive
    
    hiveConfigDefaultXml=/etc/hive/conf/hive-default.xml
    
    hiveQuery=hiveQuery.hql
    StagingDir=${nameNode}/user/${user.name}/hiveQuery/Output
    

    When I run this workflow, I end up with this error:

    ACTION[0126944-130726213131121-oozie-oozi-W@HiveEvent] Launcher exception: org/apache/hadoop/hive/cli/CliDriver
    java.lang.NoClassDefFoundError: org/apache/hadoop/hive/cli/CliDriver
    

    Error Code: JA018

    Error Message: org/apache/hadoop/hive/cli/CliDriver

    I'm not sure what this error means. Where am I going wrong?

    EDIT

    This link says that error code JA018 means "output directory exists error in workflow map-reduce action". But in my case the output directory does not exist, which makes this all the more confusing.