replace column values in one dataframe by values of another dataframe

66,526

Solution 1

If you set the index to the 'Group' column on the other df then you can replace using map on your original df 'Group' column:

In [36]:
df['Group'] = df['Group'].map(df1.set_index('Group')['Hotel'])
df

Out[36]:
         Date  Group  Family  Bonus
0  2011-06-09  Jamel  Laavin    456
1  2011-07-09  Frank  Grendy    679
2  2011-09-10   Luxy  Fantol    431
3  2011-11-02  Frank  Gondow    569

Solution 2

You could also create a dictionary and use apply:

hotel_dict = df2.set_index('Group').to_dict()
df1['Group'] = df1['Group'].apply(lambda x: hotel_dict[x])

Solution 3

just use pandas join, you can refer to detail link: http://pandas.pydata.org/pandas-docs/stable/merging.html

df1.join(df2,on='Group')

Solution 4

This is an old question but here is another way to do it, it is not like the pandas way but is fast

Reproducing the dataframe 1 - this is to be updated

df_1

    Date    Group   Family  Bonus
0   2011-06-09  tri23_1     Laavin  456
1   2011-07-09  hsgç_T2     Grendy  679
2   2011-09-10  bbbj-1Y_jn  Fantol  431
3   2011-11-02  hsgç_T2     Gondow  569

Reproducing dataframe 2 - the look up

df_2

    Group   Hotel
0   tri23_1     Jamel
1   hsgç_T2     Frank
2   bbbj-1Y_jn  Luxy
3   mlkl_781    Grand Hotel
4   vchs_94     Vancouver

Get all the hotel id (key column) from the dataframe 1 as a list

key_list = list(df_1['Group'])

['tri23_1', 'hsgç_T2', 'bbbj-1Y_jn', 'hsgç_T2']

Create a dictionary from the look up dataframe which has the key col and the value col

dict_lookup = dict(zip(df_2['Group'], df_2['Hotel']))

{'bbbj-1Y_jn': 'Luxy',
 'hsgç_T2': 'Frank',
 'mlkl_781': 'Grand Hotel',
 'tri23_1': 'Jamel',
 'vchs_94': 'Vancouver'}

Replace the value by creating a list by looking up the value and assign to dataframe 1 column

df_1['Group'] = [dict_lookup[item] for item in key_list]

Updated dataframe 1

    Date    Group   Family  Bonus
0   2011-06-09  Jamel   Laavin  456
1   2011-07-09  Frank   Grendy  679
2   2011-09-10  Luxy    Fantol  431
3   2011-11-02  Frank   Gondow  569
Share:
66,526
Amanda
Author by

Amanda

Updated on August 05, 2022

Comments

  • Amanda
    Amanda over 1 year

    I have two dataframes, the first one has 1000 rows and looks like:

    Date            Group         Family       Bonus
    2011-06-09      tri23_1       Laavin       456
    2011-07-09      hsgç_T2       Grendy       679
    2011-09-10      bbbj-1Y_jn    Fantol       431
    2011-11-02      hsgç_T2       Gondow       569
    

    The column Group has different values, sometimes repeated, but in general about 50 unique values.

    The second dataframe contains all these 50 unique values (50 rows) and also the hotels, that are associated to these values:

    Group             Hotel
    tri23_1           Jamel
    hsgç_T2           Frank
    bbbj-1Y_jn        Luxy
    mlkl_781          Grand Hotel
    vchs_94           Vancouver
    

    My goal is to replace the value in the column Group of the first dataframe by the corresponding values of the column Hotel of the second dataframe/or create the column Hotel with the corresponding values. When I try to make it just by assignment like

    df1.loc[(df1.Group=df2.Group), 'Hotel']=df2.Hotel
    

    I have an error that the dataframes are not of equal size, so the comparison is not possible.