Convert a Row to a List in Spark Scala

Solution 1

You can use the toSeq method on the Row and then cast the result from Seq[Any] to Seq[Double] (if you are sure the data types of all the columns are Double):

val df = Seq((1.0,2.0),(2.1,2.2)).toDF("A", "B")
// df: org.apache.spark.sql.DataFrame = [A: double, B: double]

df.show
+---+---+
|  A|  B|
+---+---+
|1.0|2.0|
|2.1|2.2|
+---+---+

df.first.toSeq.asInstanceOf[Seq[Double]]
// res1: Seq[Double] = WrappedArray(1.0, 2.0)
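If you would rather avoid the unchecked asInstanceOf cast, here is a minimal sketch using Row's typed getters instead (this likewise assumes every column is Double; getDouble throws a ClassCastException otherwise):

val row = df.first
// Read each column through the typed getter rather than casting the whole Seq
val doubles: List[Double] = (0 until row.length).map(row.getDouble).toList
// doubles: List[Double] = List(1.0, 2.0)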

If you have String-typed columns, use toSeq and then map with pattern matching to convert each String to a Double:

val df = Seq((1.0,"2.0"),(2.1,"2.2")).toDF("A", "B")
// df: org.apache.spark.sql.DataFrame = [A: double, B: string]

df.first.toSeq.map {
    case x: String => x.toDouble
    case x: Double => x
}
// res3: Seq[Double] = ArrayBuffer(1.0, 2.0)
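Note that this match is partial: a value of any other type (or a null) will throw a scala.MatchError at runtime. A slightly more defensive sketch, assuming you want such values to fail with a clear message:

df.first.toSeq.map {
    case x: String => x.toDouble
    case x: Double => x
    case other     => sys.error(s"Cannot convert value to Double: $other")  // fail loudly on anything else
}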

Solution 2

If you have a DataFrame of doubles that you want to convert into a List of doubles, convert the DataFrame into an RDD. This gives you an RDD[Row], which you can convert to lists as follows:

dataframe.rdd.map(_.toSeq.toList)

You will get a list of doubles for each row.
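Since _.toSeq yields Seq[Any], the static type of the result above is RDD[List[Any]]. A minimal sketch for recovering List[Double] and collecting to the driver (assuming every column is Double and the data is small enough to collect):

val lists: Array[List[Double]] =
  dataframe.rdd
    .map(_.toSeq.map(_.asInstanceOf[Double]).toList)  // cast each cell to Double
    .collect()                                        // bring all rows to the driver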

Comments

  • Mr.cysl almost 2 years

    Is that possible to do that? All the data in my dataframe (~1000 cols) are Doubles and I'm wondering whether I could turn a row of data into a list of Doubles?