`'Column' object is not callable` when showing a single spark column
Solution 1
`spDF['name']` returns a `Column` expression, not a DataFrame, and `.show()` is only defined on DataFrames. Select the column first, which gives you a DataFrame back:

spDF.select('colname').show()
Solution 2
You could also try:
```python
import pyspark
from pyspark.sql import SparkSession

sc = pyspark.SparkContext('local[*]')
spark = SparkSession.builder.getOrCreate()

# ...

spDF.createOrReplaceTempView("space")
spark.sql("SELECT name FROM space").show()
```
The first two lines are optional; they are only needed if you want to try this snippet on a local machine without an existing Spark session.
Comments
Nabih Bawazir, about 3 years ago
I'm a new Spark user coming from a pandas background. Here's my Spark DataFrame:

```
In [75]: spDF
Out[75]: DataFrame[customer_id: string, name: string]
```
and when I `show` it:

```
In [75]: spDF.show()
+-----------+-----------+
|customer_id|       name|
+-----------+-----------+
|      25620| MCDonnalds|
|      25620|  STARBUCKS|
|      25620|        nan|
|      25620|        nan|
|      25620| MCDonnalds|
|      25620|        nan|
|      25620| MCDonnalds|
|      25620|DUNKINDONUT|
|      25620|   LOTTERIA|
|      25620|        nan|
|      25620| MCDonnalds|
|      25620|DUNKINDONUT|
|      25620|DUNKINDONUT|
|      25620|        nan|
|      25620|        nan|
|      25620|        nan|
|      25620|        nan|
|      25620|   LOTTERIA|
|      25620|   LOTTERIA|
|      25620|  STARBUCKS|
+-----------+-----------+
only showing top 20 rows
```
Then I try selecting just one column:

```
In [76]: spDF['name']
Out[76]: Column<b'name'>
```
But when I try to show it, I get the following error:

```
In [79]: spDF['name'].show()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-79-f6676d5e5ca2> in <module>()
----> 1 spDF['name'].show()

TypeError: 'Column' object is not callable
```
Does anyone have an idea what this error means?