Sort Spark Dataframe with two columns in different order
Solution 1
Use the Column method desc, as shown below:
val df = Seq(
(2,6), (1,2), (1,3), (1,5), (2,3)
).toDF("A", "B")
df.orderBy($"A", $"B".desc).show
// +---+---+
// | A| B|
// +---+---+
// | 1| 5|
// | 1| 3|
// | 1| 2|
// | 2| 6|
// | 2| 3|
// +---+---+
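Outside spark-shell, the snippet above also needs a SparkSession and the implicits import that provides toDF and the $"..." syntax. A minimal self-contained sketch, assuming a throwaway local session (the app name and the local[*] master are placeholders, not part of the original answer):

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .appName("sort-two-columns")
  .master("local[*]")
  .getOrCreate()

// Brings toDF and the $"col" interpolator into scope.
import spark.implicits._

val df = Seq((2, 6), (1, 2), (1, 3), (1, 5), (2, 3)).toDF("A", "B")

// A ascending (spelled out explicitly), then B descending within each value of A.
val sorted = df.orderBy($"A".asc, $"B".desc)
sorted.show()
```

Writing $"A".asc instead of $"A" does not change the result, since ascending is the default; it only makes the intent visible.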
Solution 2
desc is the correct method to use; note, however, that it is a method of the Column class. It should therefore be applied as follows:
df.orderBy($"A", $"B".desc)
$"B".desc returns a Column, so "A" must also be changed to $"A" (or col("A") if the Spark implicits are not imported).
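The reason the two arguments must agree is that orderBy has one overload that takes only column names as strings and another that takes only Column expressions, so a String and a Column cannot be mixed in a single call. A sketch of the same sort without the $ syntax, using col and desc from org.apache.spark.sql.functions (the local session setup is an assumption for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, desc}

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .appName("orderby-overloads")
  .master("local[*]")
  .getOrCreate()

// createDataFrame on a Seq of tuples works without importing spark.implicits._
val df = spark.createDataFrame(Seq((2, 6), (1, 2), (1, 3), (1, 5), (2, 3))).toDF("A", "B")

// df.orderBy("A", desc("B"))  // does not compile: mixes a String with a Column
// Both arguments are Columns here, so the Column overload applies.
val sorted = df.orderBy(col("A"), desc("B"))
sorted.show()
```

The string-only overload, df.orderBy("A", "B"), compiles but sorts both columns ascending, which is why it cannot express this query.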
Author: kello
Updated on June 12, 2022
Comments
-
kello almost 2 years
Let's say I have a table like this:
A,B
2,6
1,2
1,3
1,5
2,3
I want to sort it in ascending order by column A, but within that I want to sort it in descending order by column B, like this:
A,B
1,5
1,3
1,2
2,6
2,3
I have tried to use orderBy("A", desc("B")) but it gives an error. How should I write the query using a DataFrame in Spark 2.0?
-
Luis Miguel Mejía Suárez over 5 years
I like to be as explicit as possible, so I would use asc on the first column ($"A".asc), even if the default behavior is to sort ascending.
-
wayneeusa about 5 years
df.orderBy($"A"desc, $"B".asc) solved my problem. Great.