Creating a new column in pandas by using a lambda function on two existing columns
You can compute the string lengths with map and pick the larger one with np.where:
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['aaa', 'bb', 'ccc'], 'b': ['rrrr', 'k', 'e']})
print(df)
#      a     b
# 0  aaa  rrrr
# 1   bb     k
# 2  ccc     e

# where len(a) > len(b) take len(a), otherwise take len(b)
df['c'] = np.where(df['a'].map(len) > df['b'].map(len), df['a'].map(len), df['b'].map(len))
print(df)
#      a     b  c
# 0  aaa  rrrr  4
# 1   bb     k  2
# 2  ccc     e  3
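As a variant of the np.where approach, the sketch below computes each length column only once. It assumes the same three-row frame shown above; the `.str.len()` accessor is swapped in for `.map(len)`, and the names `len_a` and `len_b` are illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['aaa', 'bb', 'ccc'], 'b': ['rrrr', 'k', 'e']})

# Compute the string lengths once per column (illustrative names)
len_a = df['a'].str.len()
len_b = df['b'].str.len()

# np.where(condition, x, y) picks x where the condition holds, else y, element by element
df['c'] = np.where(len_a > len_b, len_a, len_b)
print(df)
```

Storing the lengths first avoids calling `map(len)` twice per column, which matters little here but keeps the condition easier to read.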
Another solution is to use apply with the parameter axis=1 (axis=1 or 'columns' applies the function to each row):
df['c'] = df.apply(lambda x: max(len(x['a']), len(x['b'])), axis=1)
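For reference, here is the apply call as a self-contained sketch. The sample frame is assumed to be the same three-row example used for np.where, and the argument name `row` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': ['aaa', 'bb', 'ccc'], 'b': ['rrrr', 'k', 'e']})

# With axis=1 the lambda receives one row at a time as a Series
# indexed by column label, so row['a'] and row['b'] are single strings
df['c'] = df.apply(lambda row: max(len(row['a']), len(row['b'])), axis=1)
print(df)
```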
Author: piyush sharma
Updated on July 09, 2022
Comments
-
piyush sharma over 1 year
I am able to add a new column in pandas by defining a user function and then using apply. However, I want to do this using a lambda; is there a way to do that?
For example, df has two columns, a and b. I want to create a new column c which is equal to the longest length between a and b. Something like:
df['c'] = df.apply(lambda x, len(df['a']) if len(df['a']) > len(df['b']) or len(df['b']) )
One approach:
df = pd.DataFrame({'a':['dfg','f','fff','fgrf','fghj'], 'b' : ['sd','dfg','edr','df','fghjky']})
df['c'] = df.apply(lambda x: max([len(x) for x in [df['a'], df['b']]]))
print df
      a       b   c
0   dfg      sd NaN
1     f     dfg NaN
2   fff     edr NaN
3  fgrf      df NaN
4  fghj  fghjky NaN
-
Lev Levitsky almost 8 years: This will work once you fix the syntax errors. lambda x needs a colon after it, and your expression lacks else (maybe it should go instead of or).
-
piyush sharma almost 8 years: Thanks for the quick response; however, it still does not work. Here is the code and the error message. I would appreciate any help.
df = pd.DataFrame({'a':['dfg','f','fff','fgrf','fghj'], 'b' : ['sd','dfg','edr','df','fghjky']})
df['c'] = df.apply(lambda x: len(x['a']) if len(x['a']) > len(x['b']) else len(x['b']))
KeyError: ('a', u'occurred at index a')
-
Lev Levitsky almost 8 years: Please don't put code in comments; edit the question instead.
-
piyush sharma almost 8 years: Sorry, this is my first time here. I tried to edit my question, but it is still not coming out nicely formatted.
-
Lev Levitsky almost 8 years: In edit mode, there is a button that opens the formatting help. First off, you can select the code and press Ctrl-K, which will indent it by 4 spaces.
-
-
piyush sharma almost 8 years: Map might work, but mainly I am looking for a way to use a lambda with two columns to create a new column, if possible.
-
jezrael almost 8 years: Why do you want to use a lambda?
-
piyush sharma almost 8 years: The reason for using a lambda is less typing, and for me the code is more readable.
-
Fed over 3 years: For future readers, the mistake was forgetting axis=1. Without it, apply passes each column to the lambda, and a column is indexed by the row labels [0,1,2,3,4], so looking up 'a' raises the KeyError. Using df['a'], df['b'] inside the lambda does not help either. Also, jezrael's second solution is a bit neater, since the lambda is already applied row by row.
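The difference that axis=1 makes can be checked with a short sketch. It reuses the five-row frame from the question; the names `longest` and `raised` are illustrative, and the exact text of the error message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'a': ['dfg', 'f', 'fff', 'fgrf', 'fghj'],
                   'b': ['sd', 'dfg', 'edr', 'df', 'fghjky']})

longest = lambda x: max(len(x['a']), len(x['b']))

# Default axis=0: x is a whole column indexed by 0..4,
# so the label lookup x['a'] fails with a KeyError
try:
    df.apply(longest)
    raised = False
except KeyError:
    raised = True
print(raised)

# axis=1: x is a single row indexed by the column labels 'a' and 'b'
df['c'] = df.apply(longest, axis=1)
print(df)
```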