Python / Pyspark - Count NULL, empty and NaN


isnan is not a method of the Column class; you need to import it from pyspark.sql.functions:

from pyspark.sql.functions import isnan

And use it like:

df.filter((df["ID"] == "") | df["ID"].isNull() | isnan(df["ID"])).count()
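For intuition, the filter catches three distinct kinds of "missing" value: the empty string, NULL (Python's None), and the float NaN. A plain-Python analogue of the same predicate (not Spark code, just the same logic on a list):

```python
import math

def is_missing(v):
    # The three cases the Spark filter matches:
    # NULL (None), empty string, and NaN.
    if v is None:
        return True
    if v == "":
        return True
    return isinstance(v, float) and math.isnan(v)

ids = ["A1", "", None, float("nan"), "B2"]
missing_count = sum(is_missing(v) for v in ids)  # → 3
```

Note that NaN is a float value, so `v == ""` never matches it; each case needs its own check, which is exactly why the Spark filter combines three separate conditions with `|`.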
Author: qwertz

Updated on January 13, 2020

Comments

  • qwertz over 4 years

    I want to count NULL, empty, and NaN values in a column. I tried it like this:

    df.filter( (df["ID"] == "") | (df["ID"].isNull()) | ( df["ID"].isnan()) ).count()
    

    But I always get this error message:

    TypeError: 'Column' object is not callable
    

    Does anyone have an idea what might be the problem?

    Many thanks in advance!

  • qwertz over 6 years
    Do you have an idea how I can check multiple columns in this query? df["Col1, Col2, Col3"] == ""
  • Psidom over 6 years
    One possibility is to use functools.reduce; see the answer I made here.