Pandas Filter function returned a Series, but expected a scalar bool


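x.count() on a DataFrame returns one count per column (a Series), so the comparison has to be made on a single column. It should be: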

In [32]: grouped = df.groupby("student_id")

In [33]: grouped.filter(lambda x: x["student_id"].count()==1)
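
A self-contained sketch of the difference, assuming a small made-up frame (the `score` column and the values are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical data: student 1 appears twice, students 2 and 3 once.
df = pd.DataFrame({"student_id": [1, 1, 2, 3], "score": [90, 85, 70, 60]})

grouped = df.groupby("student_id")

# count() on a sub-DataFrame gives one count per column (a Series),
# so comparing it to 1 yields a Series of bools, not a single bool.
first_group = grouped.get_group(1)
per_column = first_group.count() == 1
print(per_column)

# Selecting one column first makes count() return a single number,
# so the lambda returns the scalar bool that filter expects.
unique_only = grouped.filter(lambda x: x["student_id"].count() == 1)
print(unique_only)
```

Using `len(x) == 1` inside the lambda would likely work as well, since it also reduces each group to one bool.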

Updates:

I'm not sure about the issue you mentioned regarding the interactive console. In this particular case, the console (such as IPython) should behave the same as any other environment (the standard Python interpreter or one embedded in an IDE). There may be other situations, such as the intricacies of imports, in which different environments behave differently.

An intuitive way to understand pandas groupby is to treat the object returned by DataFrame.groupby() as a list of DataFrames. So when you use filter to apply the lambda function to x, x is actually one of those DataFrames:

In[25]: df = pd.DataFrame(data,columns=year)

In[26]: df

Out[26]: 
   2013  2014
0     0     1
1     2     3
2     4     5
3     6     7
4     0     1
5     2     3
6     4     5
7     6     7

In[27]: grouped = df.groupby(2013)

In[28]: grouped.count()

Out[28]: 
      2014
2013      
0        2
2        2
4        2
6        2

In this example, the first DataFrame in the grouped object would be:

In[33]: df1 = df.loc[[0,4]]

In[34]: df1

Out[34]: 
   2013  2014
0     0     1
4     0     1
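
The "list of DataFrames" picture can be checked directly by iterating over the grouped object. A minimal sketch, assuming definitions for `data` and `year` (they are not shown above) chosen to reproduce the printed frame:

```python
import pandas as pd

# Assumed definitions: `data` and `year` are not given in the answer;
# these values reproduce the frame printed above.
year = [2013, 2014]
data = [[0, 1], [2, 3], [4, 5], [6, 7]] * 2
df = pd.DataFrame(data, columns=year)

grouped = df.groupby(2013)

# Iterating yields (key, sub-DataFrame) pairs, one per distinct key.
groups = {key: frame for key, frame in grouped}
first = groups[0]
print(type(first).__name__)
print(first)

# The function passed to filter receives each sub-frame in turn and
# must reduce it to one bool; here every group has exactly two rows.
kept = grouped.filter(lambda x: len(x) == 2)
print(len(kept))
```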
Author by

lathomas64

Software engineer struggling to try to build worlds and games in the free time. http://sscce.org/

Updated on July 20, 2022

Comments

  • lathomas64 almost 2 years

    I am attempting to use filter on a pandas DataFrame to drop every row whose value is duplicated (I need to remove ALL the rows when there are duplicates, not just the first or last).

    This is what I have, which works in the editor:

    df = df.groupby("student_id").filter(lambda x: x.count() == 1)
    

    But when I run my script with this code in it I get the error:

    TypeError: filter function returned a Series, but expected a scalar bool

    I am creating the dataframe by concatenating two other frames immediately before trying to apply the filter.