How to configure Hive to use Spark?


Solution 1

Change the Hive configuration properties as follows, in $HIVE_HOME/conf/hive-site.xml:

<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
  <description>
    Chooses execution engine.
  </description>
</property>
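
If Hive accepts the engine but cannot reach your Spark cluster, the Getting Started guide linked in Solution 3 also shows setting the Spark properties from the Hive session. A minimal sketch of those session-level settings, assuming the standalone master URL from the question (spark://hadoop.hortonworks:7077):

    set hive.execution.engine=spark;
    set spark.master=spark://hadoop.hortonworks:7077;
    set spark.eventLog.enabled=true;
    set spark.executor.memory=512m;
    set spark.serializer=org.apache.spark.serializer.KryoSerializer;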

Solution 2

set hive.execution.engine=spark;

Try this command in the Hive shell; it should run fine.
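
To confirm the change took effect, run set with the property name and no value; the Hive shell then prints the current setting:

    set hive.execution.engine;
    -- should print: hive.execution.engine=spark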

Solution 3

set hive.execution.engine=spark; was introduced in Hive 1.1. I think your Hive version is older than 1.1, which is why the value is rejected.
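
You can check the installed version from the shell; anything below 1.1 will only accept mr or tez:

    hive --version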

Resource: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

Author: Baeumla (IT Security @ Technische Universität Darmstadt)

Updated on October 18, 2020

Comments

  • Baeumla over 3 years

    I have a problem using Hive on Spark. I installed a single-node HDP 2.1 (Hadoop 2.4) via Ambari on CentOS 6.5. I'm trying to run Hive on Spark, so I followed these instructions:

    https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

    I already downloaded the "Prebuilt for Hadoop 2.4" version of Spark, which I found on the official Apache Spark website. So I started the master with:

    ./spark-class org.apache.spark.deploy.master.Master
    

    Then the worker with:

    ./spark-class org.apache.spark.deploy.worker.Worker spark://hadoop.hortonworks:7077
    

    And then I started Hive with this command:

    hive --auxpath /SharedFiles/spark-1.0.1-bin-hadoop2.4/lib/spark-assembly-1.1.0-hadoop2.4.0.jar
    

    Then, according to the instructions, I had to change the execution engine of Hive to Spark with this command:

    set hive.execution.engine=spark;
    

    And the result is:

    Query returned non-zero code: 1, cause: 'SET hive.execution.engine=spark' FAILED in validation : Invalid value.. expects one of [mr, tez].
    

    So when I launch a simple Hive query, I can see on my hadoop.hortonworks:8088 that the launched job is a MapReduce job.

    Now to my question: how can I change the execution engine of Hive so that Hive uses Spark instead of MapReduce? Are there any other ways to change it? (I already tried to change it via Ambari and in the hive-site.xml.)