How to sort data on multiple columns in Apache Spark (Scala)?


Solution 1

Suppose your input RDD/DataFrame is called df.

To sort recent in descending order, and Freq and Monitor both in ascending order, you can do:

import org.apache.spark.sql.functions._

val sorted = df.sort(desc("recent"), asc("Freq"), asc("Monitor"))

You can use df.orderBy(...) as well; it is an alias of sort().
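The same ordering can also be written with Column expressions instead of the desc/asc helper functions; a minimal sketch, assuming df has the columns shown in the question:

```scala
import org.apache.spark.sql.functions.col

// Equivalent to sort(desc("recent"), asc("Freq"), asc("Monitor")),
// expressed via Column.desc / Column.asc.
val sorted = df.orderBy(col("recent").desc, col("Freq").asc, col("Monitor").asc)
```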

Solution 2

csv.sortBy(r => (r.recent, r.freq)) or an equivalent key function should do it — RDD.sortBy orders the records by the tuple key lexicographically.
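A slightly fuller sketch of the RDD approach, assuming the CSV rows have been parsed into a hypothetical case class (Record and its fields are illustrative names, not from the original post):

```scala
// Hypothetical record type for the parsed CSV rows.
case class Record(recent: Int, freq: Int, monitor: Int)

// Ascending on all columns: sort by a tuple key (lexicographic order).
val sortedAsc = rdd.sortBy(r => (r.recent, r.freq, r.monitor))

// Mixed directions: negate a numeric field to sort it descending
// while keeping the others ascending.
val sortedMixed = rdd.sortBy(r => (-r.recent, r.freq, r.monitor))
```

Negating a field only works for numeric keys; for more general orderings you would supply a custom Ordering to sortBy.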


Author: Niranjanp · Updated on September 15, 2022

Comments

  • Niranjanp, over 1 year ago:

    I have a data set like this, which I am reading from a CSV file and converting into an RDD using Scala.

    +--------+------+---------+
    | recent | Freq | Monitor |
    +--------+------+---------+
    |      1 | 1234 |  199090 |
    |      4 | 2553 |  198613 |
    |      6 | 3232 |  199090 |
    |      1 | 8823 |  498831 |
    |      7 | 2902 |  890000 |
    |      8 | 7991 |  081097 |
    |      9 | 7391 |  432370 |
    |     12 | 6138 |  864981 |
    |      7 | 6812 |  749821 |
    +--------+------+---------+
    

    How do I sort the data on all columns?

    Thanks