How to turn off scientific notation in pyspark?
The easiest way is to cast the double column to decimal, giving an appropriate precision and scale:
from pyspark.sql.types import DecimalType

df.withColumn('total_sale_volume', df.total_sale_volume.cast(DecimalType(18, 2)))
Author: chessosapiens
Updated on June 16, 2022

Comments
-
chessosapiens almost 2 years: As the result of some aggregation, I come up with the following Spark dataframe:

+------------+-----------------+-----------------+
|sale_user_id|gross_profit     |total_sale_volume|
+------------+-----------------+-----------------+
|       20569|       -3322960.0|     2.12569482E8|
|       24269|       -1876253.0|      8.6424626E7|
|        9583|              0.0|       1.282272E7|
|       11722|          18229.0|        5653149.0|
|       37982|           6077.0|        1181243.0|
|       20428|           1665.0|        7011588.0|
|       41157|          73227.0|        1.18631E7|
|        9993|              0.0|        1481437.0|
|        9030|           8865.0|      4.4133791E7|
|         829|              0.0|          11355.0|
+------------+-----------------+-----------------+

and the schema of the dataframe is:

root
 |-- sale_user_id: string (nullable = true)
 |-- tapp_gross_profit: double (nullable = true)
 |-- total_sale_volume: double (nullable = true)

How can I disable scientific notation in the gross_profit and total_sale_volume columns?
-
Bruno Ambrozio over 3 years: Any idea how to do that without specifying the number of decimal places (exponents)? I mean, can it be inferred?
-
Mariusz over 3 years: @BrunoAmbrozio You can always .collect() a dataframe; then you have pure Python objects with more control over how they are printed (stackoverflow.com/questions/658763/…)
-
Bruno Ambrozio over 3 years: Right now I need pretty much the same, but for persisting the values to a file, and I cannot set the precision. I'd appreciate it if someone has a solution. Here's the new question: stackoverflow.com/questions/64772851/…
-
sabacherli over 2 years: DecimalType is also subject to scientific notation, depending on the precision and scale.
-
Tim Gautier about 2 years: DecimalType isn't deprecated in spark 3.0+
-
samkart almost 2 years: see this for DecimalType in spark 3.0+
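Mariusz's .collect() suggestion above amounts to formatting the values in plain Python, where the `f` format specifier never produces scientific notation. A minimal sketch (the `format_row` helper is hypothetical; with a real dataframe the input rows would come from `df.collect()`):

```python
# Format the floats of one collected row with plain Python string
# formatting; `:.2f` always prints a plain decimal, never an exponent.
def format_row(row, precision=2):
    """Return a dict with the two float columns rendered as plain decimals."""
    return {
        "sale_user_id": row["sale_user_id"],
        "gross_profit": f"{row['gross_profit']:.{precision}f}",
        "total_sale_volume": f"{row['total_sale_volume']:.{precision}f}",
    }

# With a real dataframe: formatted = [format_row(r) for r in df.collect()]
sample = {
    "sale_user_id": "20569",
    "gross_profit": -3322960.0,
    "total_sale_volume": 2.12569482e8,
}
print(format_row(sample))
# {'sale_user_id': '20569', 'gross_profit': '-3322960.00',
#  'total_sale_volume': '212569482.00'}
```

Note that this only controls how the values are printed on the driver; the dataframe itself is unchanged.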