How to write a summary of a Spark SQL DataFrame to an Excel file

describe() returns a PySpark DataFrame. The easiest way to get it into a format Excel can read is to convert it to a pandas DataFrame and write that out as a CSV file, as below:

import pandas
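# index=False drops the pandas row index; the statistic names are already in the 'summary' column
df.describe().toPandas().to_csv('fileOutput.csv', index=False)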

If you want a native Excel file, you can try the following:

import pandas
df.describe().toPandas().to_excel('fileOutput.xlsx', sheet_name='Sheet1', index=False)

Note: writing .xlsx requires the openpyxl package (pip install openpyxl on the command line). Writing the legacy .xls format relied on the xlwt package, which pandas 2.0 and later no longer support.
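
For a very wide DataFrame, the describe() output is just as wide: one column per original column and only five rows (count, mean, stddev, min, max). An .xlsx sheet can hold up to 16,384 columns, but a sheet that wide is hard to read, so one option is to transpose the summary so that each original column becomes a row. The following is a minimal sketch of that idea. The helper name transpose_summary and the sample values are made up for illustration, and the small pandas frame stands in for df.describe().toPandas().

```python
import pandas as pd


def transpose_summary(summary_pdf: pd.DataFrame) -> pd.DataFrame:
    """Turn the wide output of df.describe().toPandas() into one row per column.

    Assumes the frame has a 'summary' column holding the statistic names
    (count, mean, stddev, min, max), which is what PySpark's describe() returns.
    """
    tall = summary_pdf.set_index("summary").T  # statistics become columns
    tall.index.name = "column"                 # original column names become a field
    tall.columns.name = None
    return tall.reset_index()


# Hypothetical stand-in for df.describe().toPandas();
# PySpark returns the statistics as strings.
summary_pdf = pd.DataFrame(
    {
        "summary": ["count", "mean", "stddev", "min", "max"],
        "age": ["3", "30.0", "5.0", "25", "35"],
        "height": ["3", "170.0", "10.0", "160", "180"],
    }
)

tall = transpose_summary(summary_pdf)

# With a real Spark DataFrame df (writing .xlsx needs openpyxl):
# transpose_summary(df.describe().toPandas()).to_excel("fileOutput.xlsx", index=False)
```

The transposed sheet has one row per original column and six columns (the column name plus the five statistics), which can be scrolled and filtered in Excel. The values stay as strings here. Whether to convert them to numbers before writing is a separate choice, since min and max of string columns are not numeric.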

Author by

Ajg

Updated on June 13, 2022

Comments

  • Ajg, almost 2 years ago

    I have a very large DataFrame with 8,000 columns and 50,000 rows, and I want to write its summary statistics to an Excel file. I think the describe() method can produce them, but how do I write the result to Excel in a readable format? Thanks.