Using lambda if condition on different columns in Pandas dataframe
Solution 1
Is that what you want?
In [300]: frame[['b','c']].apply(lambda x: x['c'] if x['c']>0 else x['b'], axis=1)
Out[300]:
0 -1.099891
1 0.582815
2 0.901591
3 0.900856
dtype: float64
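As a self-contained sketch of the same row-wise approach, the snippet below rebuilds the frame from the sample values printed in the question (an assumption made for reproducibility; the output above evidently came from a different random draw) and stores the result in column d.

```python
import pandas as pd

# Sample values copied from the example frame shown in the question.
frame = pd.DataFrame({
    'a': [-0.813530, -1.066475, 1.330330, 0.400109],
    'b': [-1.291862, 0.624504, -0.675750, -1.224936],
    'c': [1.330320, 1.690770, -1.123389, -1.704173],
})

# axis=1 hands each row to the lambda as a Series, so x['c'] is a single
# number and the `if` receives an ordinary True/False.
frame['d'] = frame.apply(lambda x: x['c'] if x['c'] > 0 else x['b'], axis=1)
print(frame)
```

Rows 0 and 1 take their value from c (positive), while rows 2 and 3 fall back to b.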
Solution 2
Solution
Use a vectorized approach:
frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)
Explanation
This is derived from the sum of
(frame.c > 0) * frame.c # frame.c if positive
Plus
(frame.c <= 0) * frame.b # frame.b if c is not positive
However,
(frame.c <= 0)
is equivalent to
(1 - (frame.c > 0)) # inner parentheses are required: 1 - frame.c > 0 parses as (1 - frame.c) > 0
and when combined you get
frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)
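One way to sanity-check this formula is to compare it against an explicit element-wise selection. The sketch below uses np.where as the reference, which is a swapped-in alternative and not part of the answer above, and reuses the b and c values from the question's example. The comparison uses np.allclose because b + (c - b) can differ from c by floating-point rounding.

```python
import numpy as np
import pandas as pd

# Columns b and c copied from the example frame shown in the question.
frame = pd.DataFrame({
    'b': [-1.291862, 0.624504, -0.675750, -1.224936],
    'c': [1.330320, 1.690770, -1.123389, -1.704173],
})

# The arithmetic form: the boolean mask acts as 1 where c > 0 and 0 elsewhere.
arithmetic = frame.b + (frame.c > 0) * (frame.c - frame.b)

# Reference result: pick c where it is positive, otherwise b.
selected = pd.Series(np.where(frame.c > 0, frame.c, frame.b), index=frame.index)

# Compare with a tolerance rather than exact equality.
print(np.allclose(arithmetic, selected))
```

The two forms are not interchangeable when values are missing: a NaN in b makes the arithmetic result NaN even where c is positive, while the np.where form would still return c there.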
Author by
PeterL
Updated on July 09, 2022
Comments
PeterL almost 2 years
I have a simple dataframe:
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.random.randn(4, 3), columns=list('abc'))
Thus for example:
          a         b         c
0 -0.813530 -1.291862  1.330320
1 -1.066475  0.624504  1.690770
2  1.330330 -0.675750 -1.123389
3  0.400109 -1.224936 -1.704173
Then I want to create a column “d” that contains the value from “c” if c is positive, and the value from “b” otherwise.
I am trying:
frame['d']=frame.apply(lambda x: frame['c'] if frame['c']>0 else frame['b'],axis=0)
But I am getting “ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index a')”.
I tried googling how to solve this, but did not succeed. Any tips, please?
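For reference, the error can be reproduced with a fixed frame. The sketch below assumes the sample values printed above and catches the exception so its message can be inspected. The lambda ignores its argument x and compares the whole column frame['c'] to 0, which yields a boolean Series that `if` cannot reduce to a single True or False. The exact wording of the message may vary between pandas versions.

```python
import pandas as pd

# Sample values copied from the example frame shown above.
frame = pd.DataFrame({
    'a': [-0.813530, -1.066475, 1.330330, 0.400109],
    'b': [-1.291862, 0.624504, -0.675750, -1.224936],
    'c': [1.330320, 1.690770, -1.123389, -1.704173],
})

error_message = None
try:
    # frame['c'] > 0 is a Series of four booleans, so the conditional
    # expression raises as soon as the lambda is first called.
    frame.apply(lambda x: frame['c'] if frame['c'] > 0 else frame['b'], axis=0)
except ValueError as err:
    error_message = str(err)

print(error_message)
```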