Efficiently creating additional columns in a pandas DataFrame using .map()
You can use applymap with the dictionary get method:
In [11]: df[abc_columns].applymap(categories.get)
Out[11]:
   abc1  abc2  abc3
0  Good   Bad   Bad
1   Bad  Good  Good
2   Bad   Bad  Good
3  Good   Bad  Good
4  Good  Good   Bad
Then assign the result to the new category columns:
In [12]: abc_categories = [col + '_category' for col in abc_columns]
In [13]: abc_categories
Out[13]: ['abc1_category', 'abc2_category', 'abc3_category']
In [14]: df[abc_categories] = df[abc_columns].applymap(categories.get)
Note: you can construct abc_columns relatively efficiently using a list comprehension:
abc_columns = [col for col in df.columns if str(col).startswith('abc')]
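Putting the pieces together, here is a minimal, self-contained sketch of the approach. The frame below is rebuilt from the sample data in the question. It swaps applymap for a per-column Series.map inside DataFrame.apply, because DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map. The use of add_suffix and join to attach the new columns is my own choice here, not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    'abc1': [1, 2, 2, 1, 1],
    'abc2': [2, 1, 2, 2, 1],
    'abc3': [2, 1, 1, 1, 2],
    'xyz1': [2, 2, 2, 1, 1],
    'xyz2': [1, 1, 2, 1, 2],
    'xyz3': [2, 1, 2, 1, 1],
})
categories = {1: 'Good', 2: 'Bad', 3: 'Ugly'}

# Pick up however many abc-type columns the frame happens to have.
abc_columns = [col for col in df.columns if str(col).startswith('abc')]

# Map each selected column through the dictionary, one Series at a time.
mapped = df[abc_columns].apply(lambda col: col.map(categories))

# Rename abc1 -> abc1_category etc. and attach the result by index.
df = df.join(mapped.add_suffix('_category'))
print(df)
```

Values missing from the dictionary come out as NaN with Series.map, which is usually what you want for a lookup like this.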
Comments
-
Daniel Romero over 1 year
I am analyzing a data set that is similar in shape to the following example. I have two different types of data (abc data and xyz data):
   abc1  abc2  abc3  xyz1  xyz2  xyz3
0     1     2     2     2     1     2
1     2     1     1     2     1     1
2     2     2     1     2     2     2
3     1     2     1     1     1     1
4     1     1     2     1     2     1
I want to create a function that adds a categorizing column for each abc column that exists in the dataframe. Using lists of column names and a category mapping dictionary, I was able to get my desired result.
abc_columns = ['abc1', 'abc2', 'abc3']
xyz_columns = ['xyz1', 'xyz2', 'xyz3']
abc_category_columns = ['abc1_category', 'abc2_category', 'abc3_category']
categories = {1: 'Good', 2: 'Bad', 3: 'Ugly'}
for i in range(len(abc_category_columns)):
    df3[abc_category_columns[i]] = df3[abc_columns[i]].map(categories)
print(df3)
The end result:
   abc1  abc2  abc3  xyz1  xyz2  xyz3 abc1_category abc2_category abc3_category
0     1     2     2     2     1     2          Good           Bad           Bad
1     2     1     1     2     1     1           Bad          Good          Good
2     2     2     1     2     2     2           Bad           Bad          Good
3     1     2     1     1     1     1          Good           Bad          Good
4     1     1     2     1     2     1          Good          Good           Bad
While the for loop at the end works fine, I feel like I should be using Python's lambda function, but can't seem to figure it out. Is there a more efficient way to map in a dynamic number of abc-type columns?
-
yoshiserry almost 9 years
@AndyHayden, what is the difference between .applymap on a dataframe and .map on a pandas dataframe?
-
Andy Hayden almost 9 years
@yoshiserry applymap does it to each cell, rather than each row/col.
-
yoshiserry almost 9 years
@AndyHayden I'm not sure what you mean. So applymap applies the function to every cell (every intersection of row and column), basically across the entire dataframe, whereas .map just does it for a single row or a single column?
-
Andy Hayden almost 9 years
@yoshiserry yup. (and .apply is basically the same as .map but you'll see it used more often.)
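To make the distinction in this thread concrete, here is a small sketch on a hypothetical two-column frame (the data is made up for illustration). It shows Series.map working on one column, DataFrame.apply handing each whole column to the function, and an element-wise call over the whole frame. Since DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map, the sketch picks whichever of the two the installed version provides.

```python
import pandas as pd

# Hypothetical toy data, not taken from the question.
df = pd.DataFrame({'abc1': [1, 2], 'abc2': [2, 1]})
categories = {1: 'Good', 2: 'Bad'}

# Series.map: looks up each value of ONE column in the dictionary.
one_column = df['abc1'].map(categories)

# DataFrame.apply: the function receives each whole column as a Series.
column_sums = df.apply(lambda col: col.sum())

# Element-wise over every cell: DataFrame.map on pandas >= 2.1,
# DataFrame.applymap on older versions.
elementwise = df.map if hasattr(df, 'map') else df.applymap
all_cells = elementwise(categories.get)

print(one_column)
print(column_sums)
print(all_cells)
```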