Spark map dataframe using the dataframe's schema

19,840

Well, you can but result is rather useless:

val df = Seq(("Justin", 19, "red")).toDF("name", "age", "color")

def getValues(row: Row, names: Seq[String]) = names.map(
  name => name -> row.getAs[Any](name)
).toMap

val names = df.columns
df.rdd.map(getValues(_, names)).first

// scala.collection.immutable.Map[String,Any] = 
//   Map(name -> Justin, age -> 19, color -> red)

To get something actually useful one would a proper mapping between SQL types and Scala types. It is not hard in simple cases but it is hard in general. For example there is built-in type which can be used to represent an arbitrary struct. This can be done using a little bit of meta-programming but arguably it is not worth all the fuss.

19,840

Havnar

Raspberry Pi fan and Big Data padawan.

Updated on June 14, 2022

Comments

Havnar almost 2 years

I have a dataframe, created from a JSON object. I can query this dataframe and write it to parquet.

Since I infer the schema, I don't necessarily know what's in the dataframe.

Is there a way to the the column names out or map the dataframe using its own schema?

// The results of SQL queries are DataFrames and support all the normal  RDD operations.
// The columns of a row in the result can be accessed by field index:
df.map(t => "Name: " + t(0)).collect().foreach(println)

// or by field name:
df.map(t => "Name: " + t.getAs[String]("name")).collect().foreach(println)

// row.getValuesMap[T] retrieves multiple columns at once into a Map[String, T]
df.map(_.getValuesMap[Any](List("name", "age"))).collect().foreach(println)
// Map("name" -> "Justin", "age" -> 19)

I would want to do something like

df.map (_.getValuesMap[Any](ListAll())).collect().foreach(println)
// Map ("name" -> "Justin", "age" -> 19, "color" -> "red")

without knowing the actual amount or names of the columns.

Recents

Why Is PNG file with Drop Shadow in Flutter Web App Grainy?

How to troubleshoot crashes detected by Google Play Store for Flutter app

Cupertino DateTime picker interfering with scroll behaviour

Why does awk -F work for most letters, but not for the letter "t"?

Flutter change focus color and icon color but not works

How to print and connect to printer using flutter desktop via usb?

Critical issues have been reported with the following SDK versions: com.google.android.gms:play-services-safetynet:17.0.0

Flutter Dart - get localized country name from country code

navigatorState is null when using pushNamed Navigation onGenerateRoutes of GetMaterialPage

Android Sdk manager not found- Flutter doctor error

Flutter Laravel Push Notification without using any third party like(firebase,onesignal..etc)

How to change the color of ElevatedButton when entering text in TextField

Spark map dataframe using the dataframe's schema

Related videos on Youtube

Havnar

Comments

Recents

Related