Count the number of non-NaN entries in every column of a DataFrame


Solution 1

The count() method returns the number of non-NaN values in each column:

>>> df1.count()
a    3
b    2
d    1
dtype: int64

Similarly, count(axis=1) returns the number of non-NaN values in each row.
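
Using the same df1 as above, this gives the per-row counts:

>>> df1.count(axis=1)
A    2
B    1
C    2
D    1
dtype: int64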

Solution 2

If you want the total count of values which are not NaN over the whole DataFrame, you can do:

np.sum(df.count())
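
For the df1 above, this sums the per-column counts (3 + 2 + 1):

>>> np.sum(df1.count())
6

This assumes numpy is imported as np; df1.count().sum() gives the same total without numpy.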

Solution 3

If you are dealing with empty strings, you may want to count them as NA as well:

df.replace('', np.nan).count()

or, if you also want to count blank (whitespace-only) strings as NA:

df.replace(r'^\s*$', np.nan, regex=True).count()
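
As a quick check, using a hypothetical df2 that mixes a normal string, an empty string, a whitespace-only string, and a real NaN:

>>> df2 = pd.DataFrame({'a': ['x', '', '   ', np.nan]})
>>> df2.count()
a    3
dtype: int64
>>> df2.replace('', np.nan).count()
a    2
dtype: int64
>>> df2.replace(r'^\s*$', np.nan, regex=True).count()
a    1
dtype: int64
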
Comments

  • cryp
    cryp about 3 years

    I have a really big DataFrame and I was wondering if there was a short (one- or two-liner) way to get a count of non-NaN entries in a DataFrame. I don't want to do this one column at a time as I have close to 1000 columns.

    df1 = pd.DataFrame([(1,2,None),(None,4,None),(5,None,7),(5,None,None)], 
                        columns=['a','b','d'], index = ['A', 'B','C','D'])
    
        a   b   d
    A   1   2 NaN
    B NaN   4 NaN
    C   5 NaN   7
    D   5 NaN NaN
    

    Output:

    a: 3
    b: 2
    d: 1
    
    • cryp
      cryp about 9 years
      df1[df1.notnull()].count() seems to have worked
    • Alex Riley
      Alex Riley about 9 years
      The extra indexing with df1.notnull() is not necessary since count ignores null values anyway.
    • smci
      smci over 7 years
      Unlike series.value_counts(..., dropna=False), there is no option on df.count() to directly get NA counts.
  • DISC-O
    DISC-O almost 3 years
    I don't believe that works if the column has strings
  • Alex Riley
    Alex Riley almost 3 years
    @DISC-O: just tried and it works for me (pandas version 1.2.1). E.g. df = pd.DataFrame({"a": ["x", np.nan, "z"]}) then df.count() produces the expected value 2. Do you have an example where this method does not work?
  • DISC-O
    DISC-O almost 3 years
    Yes, if you manually create a df and place the np.nan it could work, I guess. But that is not how you typically create your columns. One often-used way, at least by me, is: df['C'] = np.where(df.A > df.B, 'some text', np.nan). This turns the np.nan into the string 'nan', which is no longer recognized as NaN (a minimal sketch of this pitfall follows after this list).
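
A minimal sketch of the pitfall described in the last comment (the column names A/B, the values, and the condition are illustrative assumptions, not taken from the question):

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 2, 2]})

# np.where promotes the result to a string dtype, so np.nan becomes the literal string 'nan'
df['C'] = np.where(df.A > df.B, 'some text', np.nan)
print(df['C'].tolist())    # ['nan', 'nan', 'some text']
print(df['C'].count())     # 3 -- the 'nan' strings are counted as valid values

# one workaround: turn the literal 'nan' strings back into real NaN before counting
print(df['C'].replace('nan', np.nan).count())    # 1

# or avoid the promotion altogether by using None as the "else" value
df['C'] = np.where(df.A > df.B, 'some text', None)
print(df['C'].count())     # 1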