How to write a summary of a Spark SQL DataFrame to an Excel file
describe() returns a PySpark DataFrame. The easiest way to get that DataFrame into an Excel-readable format is to convert it to a pandas DataFrame and then write it out as a CSV file, as below:
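import pandas  # toPandas() needs pandas installed
# index=False drops the redundant row index; the 'summary' column already labels each row
df.describe().toPandas().to_csv('fileOutput.csv', index=False)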
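If you want a real Excel file, you can try the following:
import pandas
df.describe().toPandas().to_excel('fileOutput.xlsx', sheet_name='Sheet1', index=False)
Note: writing .xlsx requires the openpyxl package (pip install openpyxl on the command line). The older .xls format relied on xlwt, which pandas 2.0 no longer supports, and .xls sheets are capped at 256 columns.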
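With 8000 columns, the describe() output is only five rows tall but about 8000 columns wide, which is awkward to read in a spreadsheet. One option is to transpose it so that each original column becomes a row. Below is a minimal sketch of that idea; transpose_summary is a hypothetical helper name, and the small pandas frame is only a stand-in for df.describe().toPandas() so the snippet runs without a Spark session.

```python
import pandas as pd


def transpose_summary(summary_pdf: pd.DataFrame) -> pd.DataFrame:
    """Turn describe() output (one row per statistic) into one row per column.

    Hypothetical helper, not part of pandas or PySpark.
    """
    # describe() labels each row in a 'summary' column: count, mean, stddev, min, max
    wide = summary_pdf.set_index("summary")
    tall = wide.T                 # original column names become the row index
    tall.index.name = "column"
    tall.columns.name = None      # drop the leftover 'summary' axis label
    return tall.reset_index()


# Stand-in for df.describe().toPandas(); Spark returns the statistics as strings
summary_pdf = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "a": ["3", "2.0", "1.0", "1", "3"],
    "b": ["3", "20.0", "10.0", "10", "30"],
})

tall = transpose_summary(summary_pdf)
print(tall)

# With a real Spark DataFrame (assumes openpyxl is installed):
# transpose_summary(df.describe().toPandas()).to_excel(
#     'fileOutput.xlsx', sheet_name='Sheet1', index=False)
```

The result has one row per original column and one column per statistic, so 8000 columns become 8000 rows, which is well within Excel's row limit.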
Author: Ajg
Updated on June 13, 2022
Comments
-
Ajg almost 2 years
I have a very large DataFrame with 8000 columns and 50000 rows. I want to write its summary statistics to an Excel file. I think the describe() method can produce them, but how do I write the result to Excel in a readable format? Thanks