PySpark: Spark data frame column width configuration in Jupyter Notebook


Solution 1

I don't think you can set a specific width, but this will ensure your data is not cut off, no matter its size:

my_df.select('field_1', 'field_2').show(10, truncate=False)
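On Spark 2.3 or newer, truncate also accepts an integer, so you can cap the displayed width at a fixed number of characters instead of disabling truncation entirely, and vertical=True prints one field per line for very wide rows. A minimal sketch, reusing the same (placeholder) column names:

my_df.select('field_1', 'field_2').show(10, truncate=80)    # cut strings longer than 80 characters
my_df.select('field_1', 'field_2').show(10, vertical=True)  # one field per line, useful for wide rows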

Solution 2

This should give you what you want:

import pandas as pd
pd.set_option('display.max_colwidth', 80)
my_df.select('field_1','field_2').limit(100).toPandas()
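If you only want the wider display for a single cell, pandas' option_context scopes the setting instead of changing it globally. A sketch assuming the same my_df; display.max_colwidth accepts None (pandas 1.0+) to disable truncation completely:

import pandas as pd

with pd.option_context('display.max_colwidth', None):
    # limit() keeps toPandas() from pulling the whole DataFrame to the driver
    display(my_df.select('field_1', 'field_2').limit(100).toPandas())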
Author by Edamame

Updated on June 30, 2022

Comments

  • Edamame, almost 2 years ago

    I have the following code in Jupyter Notebook:

    import pandas as pd
    pd.set_option('display.max_colwidth', 80)
    my_df.select('field_1','field_2').show()
    

    I want to increase the column width so I can see the full values of field_1 and field_2. I know we can use pd.set_option('display.max_colwidth', 80) for a pandas data frame, but it doesn't seem to work for a Spark data frame.

    Is there a way to increase the column width for a Spark data frame, like we can for a pandas data frame? Thanks!