Spark Dataframes: How can I change the order of columns in Java/Scala?
Solution 1
In Scala you can use the "splat" (:_*) syntax to pass a variable-length list of columns to the DataFrame.select() method.
To address your example, you can get a list of the existing columns via DataFrame.columns, which returns an array of strings. Then just sort that array and convert the values to columns. You can then "splat" the result out to the select() method:
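import org.apache.spark.sql.functions.col

// Sort the column names, then wrap each name in a Column
val mySortedCols = myDF.columns.sorted.map(str => col(str))
// Array[String]=(b,a,c,d,e) => Array[Column]=(a,b,c,d,e)
val myNewDF = myDF.select(mySortedCols: _*)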
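Sorting works here only because the target order happens to be alphabetical. For an arbitrary order, a minimal sketch might look like the following. It assumes a local SparkSession and a helper named moveToFront, which is hypothetical and not part of the Spark API. The helper places the named columns first and keeps the rest in their existing order.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ReorderColumns {
  // Hypothetical helper (not part of Spark): put the `first` columns up front,
  // followed by all remaining columns in their current order.
  def moveToFront(df: DataFrame, first: Seq[String]): DataFrame = {
    val rest = df.columns.filterNot(c => first.contains(c))
    df.select((first ++ rest).map(c => col(c)): _*)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only
    val spark = SparkSession.builder().master("local[*]").appName("reorder").getOrCreate()
    import spark.implicits._

    // Sample data with the column order from the question
    val myDF = Seq((2, 1, 3, 4, 5)).toDF("b", "a", "c", "d", "e")
    val reordered = moveToFront(myDF, Seq("a"))
    println(reordered.columns.mkString(",")) // a,b,c,d,e

    spark.stop()
  }
}
```

Because select() only builds a new projection, this does not copy or shuffle any data.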
Solution 2
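One way of doing it is to reorder the columns with select() after your join, listing them in the order you want:
// toDF needs spark.implicits._ (imported automatically in spark-shell)
case class Person(name: String, age: Int)
val persons = Seq(Person("test", 10)).toDF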
persons.show
+----+---+
|name|age|
+----+---+
|test| 10|
+----+---+
persons.select("age", "name").show
+---+----+
|age|name|
+---+----+
| 10|test|
+---+----+
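Applied to the join from the question, the same idea might be sketched as follows. The DataFrame names (ab, bcde), the sample values, and the local SparkSession are assumptions for illustration, not taken from the original post.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session, for illustration only
val spark = SparkSession.builder().master("local[*]").appName("join-reorder").getOrCreate()
import spark.implicits._

// Hypothetical inputs with the column sets from the question
val ab   = Seq((100, 1)).toDF("a", "b")
val bcde = Seq((1, 10, 20, 30)).toDF("b", "c", "d", "e")

// Join on the shared column b; the result contains a single b column
val joined = ab.join(bcde, Seq("b"))

// Pick the columns explicitly in the desired order
val reordered = joined.select("a", "b", "c", "d", "e")
reordered.show()
```

Listing the columns by name keeps the result independent of whatever order the join happens to produce.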
jest jest
Updated on June 04, 2022
After joining two DataFrames, I find that the column order is not what I expected.
Ex: Joining two DataFrames with columns [b,c,d,e] and [a,b] on b yields a column order of [b,a,c,d,e].
How can I change the order of the columns (e.g., to [a,b,c,d,e])? I've found ways to do it in Python/R but not in Scala or Java. Are there any methods that allow swapping or reordering of DataFrame columns?