rank() function usage in Spark SQL

A window spec needs to be specified for rank():

import org.apache.spark.sql.functions.rank
import org.apache.spark.sql.expressions.Window

val w = Window.orderBy("date") // some spec

val leadDf = inputDSAAcolonly.withColumn("df1Rank", rank().over(w))

Edit: Java version of the answer, since the OP is using Java:

import static org.apache.spark.sql.functions.rank;
import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.expressions.WindowSpec;

WindowSpec w = Window.orderBy(colName);
Dataset<Row> leadDf = inputDSAAcolonly.withColumn("df1Rank", rank().over(w));
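For context, `rank()` gives tied values the same rank and leaves gaps before the next distinct value (unlike `dense_rank()`). A plain-Java sketch of that semantics, independent of Spark (the data here is made up for illustration):

```java
import java.util.Arrays;
import java.util.List;

public class RankDemo {
    // SQL-style rank(): ties share a rank, and the next distinct
    // value's rank skips ahead by the number of preceding ties.
    static int[] rank(List<Integer> sortedValues) {
        int[] ranks = new int[sortedValues.size()];
        for (int i = 0; i < sortedValues.size(); i++) {
            if (i > 0 && sortedValues.get(i).equals(sortedValues.get(i - 1))) {
                ranks[i] = ranks[i - 1];  // tie: same rank as previous row
            } else {
                ranks[i] = i + 1;         // rank = row position + 1, leaving a gap after ties
            }
        }
        return ranks;
    }

    public static void main(String[] args) {
        // Input must already be sorted by the window's ORDER BY column.
        List<Integer> values = Arrays.asList(10, 20, 20, 30);
        System.out.println(Arrays.toString(rank(values)));  // [1, 2, 2, 4]
    }
}
```

Note the `4` after the tie: Spark's `rank()` behaves the same way, whereas `dense_rank()` would produce `[1, 2, 2, 3]`.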
Author by

Binu

Hi, I am a technology enthusiast, currently working in Big Data Analytics. It's fun to learn new things. It's heartening to see a great community of contributors who help interested learners pick up skills and apply them. Thanks and have a great day!

Updated on June 18, 2022

Comments

  • Binu
    Binu almost 2 years

    Need some pointers on using rank().

    I have extracted a column from a dataset and need to do the ranking.

    Dataset<Row> inputCol= inputDataset.apply("Colname");    
    Dataset<Row>  DSColAwithIndex=inputDSAAcolonly.withColumn("df1Rank", rank());
    
    DSColAwithIndex.show();
    

    I can sort the column and then append an index column to get the rank, but I'm curious to know the syntax and usage of rank().

  • Binu
    Binu about 7 years
    Thanks, tried the below in Java and it worked fine:

        import org.apache.spark.sql.expressions.WindowSpec;
        WindowSpec w = org.apache.spark.sql.expressions.Window.orderBy(colName);
        Dataset<Row> leadDf = inputDSAAcolonly.withColumn("df1Rank", rank().over(w));