Pandas conditional subset for dataframe with bool values and ints
Solution 1
IIUC, starting from a sample dataframe like:
A,B,C
01,True,1
01,False,2
02,False,1
02,True,2
03,True,1
you can:
df = df[(df['C']==1) | (df['B']==True)]
which returns:
   A      B  C
0  1   True  1
2  2  False  1
3  2   True  2
4  3   True  1
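For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the sample above is read with `pd.read_csv`, which parses `01` as the integer 1 and `True`/`False` as booleans; the names `csv_text` and `subset` are only for illustration.

```python
import io

import pandas as pd

# Sample data from the answer above, loaded from an in-memory CSV string.
csv_text = """A,B,C
01,True,1
01,False,2
02,False,1
02,True,2
03,True,1
"""
df = pd.read_csv(io.StringIO(csv_text))

# Each comparison is wrapped in parentheses because | binds more
# tightly than == in Python.
subset = df[(df['C'] == 1) | (df['B'] == True)]
print(subset)
```

Only the row with index 1 (B is False and C is 2) is dropped.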
Solution 2
There are a couple of ways to filter, and performance varies with the size of your data:
In [722]: df[(df['C']==1) | df['B']]
Out[722]:
   A      B  C
0  1   True  1
2  2  False  1
3  2   True  2
4  3   True  1
In [723]: df.query('C==1 or B==True')
Out[723]:
   A      B  C
0  1   True  1
2  2  False  1
3  2   True  2
4  3   True  1
In [724]: df[df.eval('C==1 or B==True')]
Out[724]:
   A      B  C
0  1   True  1
2  2  False  1
3  2   True  2
4  3   True  1
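To check the performance remark on your own data, something like the following rough sketch may help. It rebuilds the sample frame, confirms the three approaches agree, and times each with `timeit`. The helper name `time_it` and the repeat count are arbitrary choices for illustration, and the timings on a five-row frame say little about large data.

```python
import timeit

import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 2, 2, 3],
    'B': [True, False, False, True, True],
    'C': [1, 2, 1, 2, 1],
})

# The three filters shown above.
by_mask = df[(df['C'] == 1) | df['B']]
by_query = df.query('C==1 or B==True')
by_eval = df[df.eval('C==1 or B==True')]


def time_it(func, number=100):
    """Return total seconds for `number` calls of func (hypothetical helper)."""
    return timeit.timeit(func, number=number)


timings = {
    'mask': time_it(lambda: df[(df['C'] == 1) | df['B']]),
    'query': time_it(lambda: df.query('C==1 or B==True')),
    'eval': time_it(lambda: df[df.eval('C==1 or B==True')]),
}
for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f}s for 100 runs')
```

Swap in a frame of realistic size before drawing any conclusion about which one is faster.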
Author by
Christopher Jenkins
Updated on June 05, 2022
Comments
Christopher Jenkins almost 2 years
I have a dataframe with three series. Column A contains a group_id. Column B contains True or False. Column C contains a 1-n ranking (where n is the number of rows per group_id).
I'd like to store a subset of this dataframe containing every row where either:
1) Column C == 1, or
2) Column B == True
The following logic copies my old dataframe row for row into the new dataframe:
new_df = df[df.column_b | df.column_c == 1]
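A likely cause is operator precedence: in Python `|` binds more tightly than `==`, so the expression above is parsed as `(df.column_b | df.column_c) == 1` rather than as the intended OR of two conditions. The sketch below illustrates this. The column names follow the question, but the data values are an assumption borrowed from the sample in Solution 1.

```python
import pandas as pd

# Assumed sample data; only the column names come from the question.
df = pd.DataFrame({
    'column_a': [1, 1, 2, 2, 3],
    'column_b': [True, False, False, True, True],
    'column_c': [1, 2, 1, 2, 1],
})

# What the question's expression is, and how Python groups it.
as_written = df.column_b | df.column_c == 1
as_parsed = (df.column_b | df.column_c) == 1

# What was intended: parenthesise the comparison before OR-ing.
intended = df.column_b | (df.column_c == 1)
new_df = df[intended]
print(new_df)
```

With the parentheses in place, only rows where column_b is False and column_c is not 1 are excluded.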