python: pandas np.where vs. df.loc with multiple conditions
13,416
I think your boolean
are not string
s, so need remove '
:
df = pd.DataFrame({'Column_A': ['AAA','AAA','ABC','CDE'],
'checked': ['0','0','1','0'],
'duplicate': [True, True, False, False]})
df['flag'] = np.where((df['checked'] == 'Y') &(df['duplicate'] == True), 'Y', '0')
print (df)
Column_A checked duplicate flag
0 AAA 0 True 0
1 AAA 0 True 0
2 ABC 1 False 0
3 CDE 0 False 0
Or if compare with boolean
column, == True
can be omited:
df['flag'] = np.where((df['checked'] == 'Y') &(df['duplicate']), 'Y', '0')
print (df)
Column_A checked duplicate flag
0 AAA 0 True 0
1 AAA 0 True 0
2 ABC 1 False 0
3 CDE 0 False 0
Also if need check checked
need '
because strings
:
df['flag'] = np.where((df['checked'] == '0') &(df['duplicate'] == True), 'Y', '0')
print (df)
Column_A checked duplicate flag
0 AAA 0 True Y
1 AAA 0 True Y
2 ABC 1 False 0
3 CDE 0 False 0
EDIT:
Solution with loc
:
df['flag'] = '0'
mask = (df['checked'] == '0') &(df['duplicate'])
df.loc[mask, 'flag'] = 'Y'
print (df)
Column_A checked duplicate flag
0 AAA 0 True Y
1 AAA 0 True Y
2 ABC 1 False 0
3 CDE 0 False 0
Related videos on Youtube
Author by
jeangelj
BY DAY: Data Analyst BY NIGHT: Assassin's Creed & Tomb Raider player "The Matrix has you ..." - Trinity
Updated on October 13, 2022Comments
-
jeangelj over 1 year
Np.where has been giving me a lot of errors, so I am looking for a solution with df.loc instead.
This is the np.where error I have been getting:
C:\Users\xxx\AppData\Local\Continuum\Anaconda2\lib\site-packages\ipykernel\__main__.py:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy if __name__ == '__main__':
I am working with the following dataframe df:
df = pd.DataFrame({'Column_A': ['AAA','AAA','ABC','CDE'],'checked': ['0','0','1','0'],'duplicate': ['True','True','False','False']}) Column_A checked duplicate 0 AAA 0 True 1 AAA 0 True 2 ABC 1 False 3 CDE 0 False
I want to create an additional flag, if checked is 0 and duplicate is True.
I tried this and it didn't work:
df['flag'] = (np.where((df['checked'] == 'Y') &(df['duplicate'] == 'True'), 'Y', '0')) TypeError: invalid type comparison
I tried it with df.loc:
df['flag'] = (df.loc[df['checked'] == 'Y']& df.loc[df['duplicate'] == 'True'], 'Y','0') TypeError: invalid type comparison
and I get the same error!
-
jeangelj almost 7 yearsoh - I see! how woul dI do an OR statement instead of &?
-
jeangelj almost 7 yearsthank you very much - is there a way to do it with df.loc?
-
jezrael almost 7 yearsI think yes, it depends what you need.
&
is bitwiseand
and|
is bitwiseor
.