How do I pass parameters to selectExpr? SparkSQL-Scala

Try an interpolated string (note the s prefix), with the value wrapped in single quotes:

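import org.apache.spark.sql.DataFrame

// The single quotes make Spark SQL read the value as a string literal.
def generar_informe(df: DataFrame, english: String): DataFrame =
  df.selectExpr(
    "transactionId", "customerId", "itemId", "amountPaid", s"'$english' as saludo")

// data is the DataFrame being reported on.
val english = "hello"
generar_informe(data, english).show()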
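
The attempts in the question differ from the snippet above only in the string that reaches selectExpr. A minimal plain-Scala sketch (no Spark needed; the object name InterpolationDemo and the value names are made up for illustration) shows what each variant evaluates to:

```scala
object InterpolationDemo {
  val tabla = "hello"

  // No s prefix: the dollar sign and braces stay in the string verbatim.
  val plain = "${tabla} as C"

  // s prefix, no quotes: the value is substituted, but it is bare text.
  val unquoted = s"$tabla as C"

  // s prefix and single quotes: the value becomes a SQL string literal.
  val quoted = s"'$tabla' as C"

  def main(args: Array[String]): Unit = {
    println(plain)    // ${tabla} as C
    println(unquoted) // hello as C
    println(quoted)   // 'hello' as C
  }
}
```

Only the last string describes a constant column. The first still contains the literal ${tabla}, which the SQL parser rejects, and the second would be read as a reference to a column named hello.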

This is the output I got from generar_informe:

17/11/02 23:56:44 INFO CodeGenerator: Code generated in 13.857987 ms
+-------------+----------+------+----------+------+
|transactionId|customerId|itemId|amountPaid|saludo|
+-------------+----------+------+----------+------+
|          111|         1|     1|     100.0| hello|
|          112|         2|     2|     505.0| hello|
|          113|         3|     3|     510.0| hello|
|          114|         4|     4|     600.0| hello|
|          115|         1|     2|     500.0| hello|
|          116|         1|     2|     500.0| hello|
|          117|         1|     2|     500.0| hello|
|          118|         1|     2|     500.0| hello|
|          119|         2|     3|     500.0| hello|
|          120|         1|     2|     500.0| hello|
|          121|         1|     4|     500.0| hello|
|          122|         1|     2|     500.0| hello|
|          123|         1|     4|     500.0| hello|
|          124|         1|     2|     500.0| hello|
+-------------+----------+------+----------+------+

17/11/02 23:56:44 INFO SparkContext: Invoking stop() from shutdown hook
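
If the value might itself contain quote characters, one alternative that avoids building the SQL string by hand is the lit function from org.apache.spark.sql.functions. This is a different technique from the selectExpr call above, and generar_informe_lit is a hypothetical name. A sketch, assuming the DataFrame already holds only the columns you want to keep:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

object LitAlternative {
  // Adds a constant string column named saludo. The value is passed as a
  // Scala String, so no SQL quoting or escaping is involved.
  def generar_informe_lit(df: DataFrame, english: String): DataFrame =
    df.withColumn("saludo", lit(english))
}
```

withColumn appends saludo after the existing columns, so for a DataFrame that contains just transactionId, customerId, itemId and amountPaid the result should have the same shape as the selectExpr version.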
Author: Borja

Updated on June 28, 2022

Comments

  • Borja, almost 2 years

    :)

    When you have a data frame, you can add columns and fill their rows with the selectExpr method.

    Something like this:

    scala> table.show
    +------+--------+---------+--------+--------+
    |idempr|tipperrd| codperrd|tipperrt|codperrt|
    +------+--------+---------+--------+--------+
    |  OlcM|       h|999999999|       J|       0|
    |  zOcQ|       r|777777777|       J|       1|
    |  kyGp|       t|333333333|       J|       2|
    |  BEuX|       A|999999999|       F|       3|
    
    scala> var table2 = table.selectExpr("idempr", "tipperrd", "codperrd", "tipperrt", "codperrt", "'hola' as Saludo")
    table2: org.apache.spark.sql.DataFrame = [idempr: string, tipperrd: string, codperrd: decimal(9,0), tipperrt: string, codperrt: decimal(9,0), Saludo: string]
    
    scala> table2.show
    +------+--------+---------+--------+--------+------+
    |idempr|tipperrd| codperrd|tipperrt|codperrt|Saludo|
    +------+--------+---------+--------+--------+------+
    |  OlcM|       h|999999999|       J|       0|  hola|
    |  zOcQ|       r|777777777|       J|       1|  hola|
    |  kyGp|       t|333333333|       J|       2|  hola|
    |  BEuX|       A|999999999|       F|       3|  hola|
    

    My problem is this:

    I define a string and pass it to a method that should use that String parameter to fill a column in the data frame. But I cannot get the select expression to pick up the string's value (I tried $, +, etc.). I want to achieve something like this:

    scala> var english = "hello"
    
    scala> def generar_informe(df: DataFrame, tabla: String) {
        var selectExpr_df = df.selectExpr(
          "TIPPERSCON_BAS as TIP.PERSONA CONTACTABILIDAD",
          "CODPERSCON_BAS as COD.PERSONA CONTACTABILIDAD",
          "'tabla' as PUNTO DEL FLUJO" )
    }
    
    scala> generar_informe(df,english)
    
    .....
    
    scala> table2.show
    +------+--------+---------+--------+--------+------+
    |idempr|tipperrd| codperrd|tipperrt|codperrt|Saludo|
    +------+--------+---------+--------+--------+------+
    |  OlcM|       h|999999999|       J|       0| hello|
    |  zOcQ|       r|777777777|       J|       1| hello|
    |  kyGp|       t|333333333|       J|       2| hello|
    |  BEuX|       A|999999999|       F|       3| hello|
    

    I tried:

    scala> var result = tabl.selectExpr("A", "B", "$tabla as C")
    
    scala> var abc = tabl.selectExpr("A", "B", ${tabla} as C)
        <console>:31: error: not found: value $
                 var abc = tabl.selectExpr("A", "B", ${tabla} as C)
    
    scala> var abc = tabl.selectExpr("A", "B", "${tabla} as C")
    
    scala> sqlContext.sql("set tabla='hello'")
    scala> var abc = tabl.selectExpr("A", "B", "${tabla} as C")
    

    The other attempts all give the same error:

    java.lang.RuntimeException: [1.1] failure: identifier expected
    ${tabla} as C
    ^
        at scala.sys.package$.error(package.scala:27)
    

    Thanks in advance!

    • Achilleus, over 6 years
      I couldn't reproduce this, but did "$tabla as PUNTO DEL FLUJO" not work? I assume you have already tried it, but I was curious.
    • Borja, over 6 years
      I replied above, in the question, so that the code is formatted and easier to read ;)