Spark RDD sort by two values
You can build a key-value RDD where the key is a tuple of (rank, popularity) and the value is the name, then sort by the key. Tuples are compared element by element, so popularity only decides the order when two ranks are equal.
For example:
// _._1 - name
// _._2 - popularity
// _._3 - rank
// Key each record by (rank, popularity); take(10) returns an Array, not an RDD.
val top10 = myRDD.map(line => ((line._3, line._2), line._1))
  .sortBy(_._1, ascending = false)
  .take(10)
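The tuple-key idea can be checked without a cluster. The following is a minimal sketch on a plain Scala `Seq`, assuming hypothetical sample records in the same (name, popularity, rank) shape; the object name `TupleSortSketch` and the data are made up for illustration. It relies on the standard tuple `Ordering`, which is the same kind of ordering `RDD.sortBy` resolves for a tuple key. It also shows one way to mix directions (rank descending, popularity ascending) by negating the `Int` rank, which a single `ascending` flag cannot express.

```scala
object TupleSortSketch {
  // Hypothetical sample data: (name, popularity, rank)
  val records: Seq[(String, Int, Int)] = Seq(
    ("a", 10, 1),
    ("b", 30, 2),
    ("c", 20, 2),
    ("d", 40, 1)
  )

  // Both fields descending: compare rank first, popularity only on ties.
  val bothDescending: Seq[(String, Int, Int)] =
    records.sortBy(r => (r._3, r._2))(Ordering[(Int, Int)].reverse)

  // Mixed directions: rank descending (via negation), popularity ascending.
  val rankDescPopularityAsc: Seq[(String, Int, Int)] =
    records.sortBy(r => (-r._3, r._2))

  def main(args: Array[String]): Unit = {
    println(bothDescending.map(_._1))        // List(b, c, d, a)
    println(rankDescPopularityAsc.map(_._1)) // List(c, b, a, d)
  }
}
```

On an RDD, the same key function `r => (r._3, r._2)` could be passed to `sortBy` directly, so mapping to a key-value pair first is optional. Negating a field only works for numeric keys and should be avoided if values can reach `Int.MinValue`.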
Authored by safat siddiqui
Updated on April 13, 2020
Comments
safat siddiqui (about 4 years ago):
I have an RDD of (name: String, popularity: Int, rank: Int). I want to sort it by rank and, if the ranks match, then by popularity. I am currently doing this with two transformations:
var result = myRDD
  .sortBy(_._2, ascending = false)
  .sortBy(_._3, ascending = false)
  .take(10)
Can I do it in one transformation?