Using lambda if condition on different columns in Pandas dataframe


Solution 1

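Is this what you want? With axis=1 the lambda receives one row at a time, so x['c'] is a single value and the comparison is unambiguous: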

In [300]: frame[['b','c']].apply(lambda x: x['c'] if x['c']>0 else x['b'], axis=1)
Out[300]:
0   -1.099891
1    0.582815
2    0.901591
3    0.900856
dtype: float64
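For a reproducible check, here is a minimal sketch of the same row-wise apply on fixed data. The values below are made up for illustration (the original frame is random), and the result is assigned to a new column d as the question asks.

```python
import pandas as pd

# Hypothetical fixed values, chosen only so the result is reproducible
frame = pd.DataFrame({
    "a": [0.5, -1.0, 1.5, 0.25],
    "b": [-1.25, 0.5, -0.75, -1.0],
    "c": [1.5, 1.75, -1.0, -2.0],
})

# axis=1 hands the lambda one row at a time, so x["c"] is a scalar
frame["d"] = frame.apply(lambda x: x["c"] if x["c"] > 0 else x["b"], axis=1)

print(frame["d"].tolist())  # [1.5, 1.75, -0.75, -1.0]
```

Row-wise apply calls the Python function once per row, so it is usually slower than a vectorized expression on large frames.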

Solution 2

Solution

Use a vectorized approach:

frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)

Explanation

This is derived from the sum of

(frame.c > 0) * frame.c  # frame.c if positive

Plus

(frame.c <= 0) * frame.b  # frame.b if c is not positive

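However, because a boolean behaves as 0 or 1 in arithmetic,

(frame.c <= 0)

is equivalent to

1 - (frame.c > 0)

Substituting this into the sum and collecting the (frame.c > 0) terms gives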

frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)
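The derivation can be checked numerically. The sketch below uses made-up fixed values and compares the arithmetic form with np.where, which is an alternative not mentioned in the answer above and is shown here only as a cross-check.

```python
import numpy as np
import pandas as pd

# Hypothetical fixed values, chosen only so the result is reproducible
frame = pd.DataFrame({
    "b": [-1.25, 0.5, -0.75, -1.0],
    "c": [1.5, 1.75, -1.0, -2.0],
})

mask = frame.c > 0  # boolean Series; True counts as 1 and False as 0 in arithmetic

# The complement identity used in the derivation: (c <= 0) == 1 - (c > 0)
assert ((frame.c <= 0) == (1 - mask).astype(bool)).all()

# Arithmetic form from the answer
arith = frame.b + mask * (frame.c - frame.b)

# Cross-check: pick c where the mask is True, otherwise b
picked = pd.Series(np.where(mask, frame.c, frame.b), index=frame.index)

assert arith.equals(picked)
print(arith.tolist())  # [1.5, 1.75, -0.75, -1.0]
```

One difference to keep in mind: if b or c contains NaN or infinite values, the arithmetic form can yield NaN where np.where would simply select a value, because 0 * NaN is NaN.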
Author: PeterL

Updated on July 09, 2022

Comments

  • PeterL, almost 2 years

    I have a simple dataframe:

    import numpy as np
    import pandas as pd
    frame = pd.DataFrame(np.random.randn(4, 3), columns=list('abc'))
    

    Thus for example:

              a         b         c
    0 -0.813530 -1.291862  1.330320
    1 -1.066475  0.624504  1.690770
    2  1.330330 -0.675750 -1.123389
    3  0.400109 -1.224936 -1.704173
    

    I then want to create a column “d” that contains the value from “c” if c is positive, and otherwise the value from “b”.

    I am trying:

    frame['d']=frame.apply(lambda x: frame['c'] if frame['c']>0 else frame['b'],axis=0)
    

    But I am getting: ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index a')

    I tried searching for a solution but did not succeed. Any tips, please?