How to turn off scientific notation in pyspark?

The easiest way is to cast the double column to decimal, specifying an appropriate precision and scale:

from pyspark.sql.types import DecimalType

df.withColumn('total_sale_volume', df.total_sale_volume.cast(DecimalType(18, 2)))
Author: chessosapiens

Updated on June 16, 2022

Comments

  • chessosapiens
    chessosapiens almost 2 years

As the result of some aggregation I come up with the following Spark dataframe:

    +------------+-----------------+-----------------+
    |sale_user_id|gross_profit     |total_sale_volume|
    +------------+-----------------+-----------------+
    |       20569|       -3322960.0|     2.12569482E8|
    |       24269|       -1876253.0|      8.6424626E7|
    |        9583|              0.0|       1.282272E7|
    |       11722|          18229.0|        5653149.0|
    |       37982|           6077.0|        1181243.0|
    |       20428|           1665.0|        7011588.0|
    |       41157|          73227.0|        1.18631E7|
    |        9993|              0.0|        1481437.0|
    |        9030|           8865.0|      4.4133791E7|
    |         829|              0.0|          11355.0|
    +------------+-----------------+-----------------+
    

    and the schema of the dataframe is:

    root
     |-- sale_user_id: string (nullable = true)
     |-- tapp_gross_profit: double (nullable = true)
     |-- total_sale_volume: double (nullable = true)
    

    How can I disable scientific notation in the gross_profit and total_sale_volume columns?

  • Bruno Ambrozio
    Bruno Ambrozio over 3 years
    Any idea how to do that without specifying the number of decimal places (exponents)? I mean, can it be inferred?
  • Mariusz
    Mariusz over 3 years
    @BrunoAmbrozio You can always .collect() a dataframe, and then you have a pure python objects with more control on how these are printed (stackoverflow.com/questions/658763/…)
  • Bruno Ambrozio
    Bruno Ambrozio over 3 years
    Right now I need pretty much the same, but for persisting the values to a file; however, I cannot set the precision. I'd appreciate it if someone has a solution. Here's the new question: stackoverflow.com/questions/64772851/…
  • sabacherli
    sabacherli over 2 years
    DecimalType is also subject to scientific notation, depending on the precision and scale.
  • Tim Gautier
    Tim Gautier about 2 years
    DecimalType isn't deprecated in spark 3.0+
  • samkart
    samkart almost 2 years
    see this for DecimalType in spark 3.0+
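
Following up on the `.collect()` suggestion in the comments, a minimal pure-Python sketch of formatting the collected values (the sample rows are hypothetical stand-ins for the output of `df.collect()`):

```python
# Stand-in for df.collect(): (sale_user_id, gross_profit, total_sale_volume) tuples
rows = [("20569", -3322960.0, 2.12569482e8), ("24269", -1876253.0, 8.6424626e7)]

# Fixed-point formatting avoids scientific notation regardless of magnitude
formatted = [
    (user_id, f"{profit:.2f}", f"{volume:.2f}")
    for user_id, profit, volume in rows
]
print(formatted[0])  # ('20569', '-3322960.00', '212569482.00')
```

Once the rows are plain Python objects, any of Python's string-formatting tools apply, which gives full control over precision when writing to a file.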