Ways to Plot Spark Dataframe without Converting it to Pandas

The display function is available only in Databricks notebooks; it is not part of Spark itself.

Author: Admin

Updated on January 17, 2020

Comments

  • Admin
    Admin over 4 years

    Is there any way to plot data from a Spark DataFrame without converting it to a pandas DataFrame?

    I did some research online but couldn't find one. I need to save these plots automatically as .pdf files, so the built-in visualization tool in Databricks would not work.

    Right now, this is what I'm doing (as an example):

    import matplotlib.pyplot as plt
    # df = some Spark DataFrame
    df = df.toPandas()  # collects every row onto the driver
    df.plot()
    display(plt.show())
    

    I want to produce line graphs, histograms, bar charts and scatter plots without converting my DataFrame to a pandas DataFrame. Thank you!

  • DieterDP
    DieterDP over 4 years
    This does not seem to work for me in Jupyter notebooks. Is this answer specifically for Databricks notebooks?
  • Sander Vanden Hautte
    Sander Vanden Hautte over 2 years
    Yes, it is for Databricks only.
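
For plots that have to be saved as .pdf outside of Databricks, one possible approach is to let Spark do the aggregation and hand only the small result to matplotlib. The sketch below does this for a histogram: RDD.histogram computes the bin edges and counts on the cluster, and matplotlib draws them and writes the PDF, with no pandas DataFrame involved. The helper names save_histogram_pdf and spark_histogram_to_pdf are hypothetical, and the code assumes the column is numeric and that matplotlib is installed on the driver.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed to save figures
import matplotlib.pyplot as plt


def save_histogram_pdf(edges, counts, path, title=""):
    """Draw pre-binned histogram data and save it as a PDF (hypothetical helper)."""
    # Each bar spans one bin: from edges[i] to edges[i + 1].
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    fig, ax = plt.subplots()
    ax.bar(edges[:-1], counts, width=widths, align="edge", edgecolor="black")
    ax.set_title(title)
    ax.set_ylabel("count")
    fig.savefig(path, format="pdf")
    plt.close(fig)  # release the figure so repeated calls do not pile up
    return path


def spark_histogram_to_pdf(df, column, path, bins=10):
    """Bin `column` in Spark, then plot only the bin edges and counts.

    Assumes `df` is a Spark DataFrame and `column` holds numeric values.
    """
    # Drop nulls, then flatten the single-column Rows into plain values.
    values = df.select(column).dropna().rdd.flatMap(lambda row: row)
    # RDD.histogram returns (bin_edges, counts); only these reach the driver.
    edges, counts = values.histogram(bins)
    return save_histogram_pdf(edges, counts, path, title=column)
```

A call such as spark_histogram_to_pdf(df, "price", "price_hist.pdf") would then write the file, where "price" is only a placeholder column name. The same pattern (aggregate with groupBy or sample in Spark, collect the small result, plot it) should carry over to bar charts, line graphs and scatter plots.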