SPARK : failure: ``union'' expected but `(' found


Solution 1

Spark 2.0+

Spark 2.0 introduced a native implementation of window functions (SPARK-8641), so a HiveContext should no longer be required. Nevertheless, similar errors that are not related to window functions can still be attributed to the differences between the SQL parsers.
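In Spark 2.0+ the same window query runs against a plain SparkSession. A minimal sketch (the DataFrame contents and app name here are made up for illustration):

```scala
// Spark 2.0+: window functions work without Hive support.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.row_number

val spark = SparkSession.builder().appName("window-demo").getOrCreate()
import spark.implicits._

val df = Seq((3, "c"), (1, "a"), (2, "b")).toDF("employee_id", "name")

// SQL route, as in the question:
df.createOrReplaceTempView("d_f")
spark.sql(
  "SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f").show()

// Equivalent DataFrame API route:
df.withColumn("row_number",
  row_number().over(Window.orderBy($"employee_id"))).show()
```

Note this sketch needs a running Spark 2.x environment; it will not compile against Spark 1.x, where SparkSession does not exist.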

Spark <= 1.6

Window functions were introduced in Spark 1.4.0 and require a HiveContext to work. A plain SQLContext won't work here.

Be sure you use Spark >= 1.4.0 and create a HiveContext:

import org.apache.spark.sql.hive.HiveContext
val sqlContext = new HiveContext(sc)
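With the HiveContext in place, the original ROW_NUMBER query from the question should parse. A sketch, assuming the df from the question and the sqlContext created above:

```scala
// Registering the DataFrame and running the question's query through
// HiveContext's parser, which understands the OVER clause.
df.registerTempTable("d_f")
val result = sqlContext.sql(
  "SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f")
result.show()
```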

Solution 2

Yes, it is true.

I am using Spark 1.6.0, and there you need a HiveContext to use the dense_rank method.

From Spark 2.0.0 onwards, a HiveContext is no longer required for dense_rank.

So for Spark >= 1.4 and < 2.0 you should do it like this:

Table hive_employees has three fields: place: String, name: String, salary: Int.

val conf = new SparkConf().setAppName("denseRank test")//.setMaster("local")
val sc = new SparkContext(conf)

// A plain SQLContext cannot parse window functions here; use HiveContext instead.
val hqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

val result = hqlContext.sql(
  "select place, name, salary, " +
  "dense_rank() over (partition by place order by salary desc) as rank " +
  "from hive_employees")

result.show()
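To see what dense_rank() actually computes, here is a plain-Scala sketch of its tie behavior (the helper and data are hypothetical, not a Spark API): values that tie share a rank, and no rank numbers are skipped afterwards, unlike rank().

```scala
// Plain-Scala illustration of dense_rank semantics (ascending order):
// each distinct value gets the next consecutive rank, so ties share a
// rank and no numbers are skipped.
def denseRank[A](xs: Seq[A])(implicit ord: Ordering[A]): Map[A, Int] =
  xs.distinct.sorted.zipWithIndex.map { case (v, i) => v -> (i + 1) }.toMap

val salaries = Seq(300, 100, 200, 200)
val ranks = salaries.map(denseRank(salaries))
// the two 200s share rank 2, and 300 still gets rank 3 (no gap)
```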

Author: user1735076

Updated on July 18, 2022

Comments

  • user1735076
    user1735076 almost 2 years

    I have a dataframe called df with column named employee_id. I am doing:

    df.registerTempTable("d_f")
    val query = """SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f"""
    val result = Spark.getSqlContext().sql(query)
    

    But I am getting the following issue. Any help?

    [1.29] failure: ``union'' expected but `(' found
    SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f
                                ^
    java.lang.RuntimeException: [1.29] failure: ``union'' expected but `(' found
    SELECT *, ROW_NUMBER() OVER (ORDER BY employee_id) row_number FROM d_f
    
  • Daniel Darabos
    Daniel Darabos about 8 years
    sc is SparkContext.
  • Daniel Darabos
    Daniel Darabos about 8 years
    But why do window functions need a HiveContext? What is the difference between HiveContext and SQLContext?
  • zero323
    zero323 about 8 years
    @DanielDarabos In this particular case it is simply about the support for Hive UDAFs. All window functions in Spark < 2.0.0 are expressed using Hive UDAF, hence cannot work without HiveContext.
  • Daniel Darabos
    Daniel Darabos about 8 years
    I see, thanks! Thanks for updating the answer too. I've added a link to issues.apache.org/jira/browse/SPARK-8641.