Merging pandas columns (one-to-many)

15,957

You can use first groupby with join:

df2 = df2.groupby('ID')['Tag'].apply(', '.join).reset_index()
print (df2)
        ID                     Tag
0  3763058     item1, item2, item3
1  3763077  item_4, item_5, item_6

Then is possible use merge, especially if df1 has more columns:

df = pd.merge(df1, df2, on='ID', how='left')
print (df)
        ID  Name                     Tag
0  3763058  Andi     item1, item2, item3
1  3763077  Mark  item_4, item_5, item_6

Solution with map, if need add only one column:

df2 = df2.groupby('ID')['Tag'].apply(', '.join).reset_index()
df2['Name'] = df2['ID'].map(df1.set_index('ID')['Name'])
print (df2)
        ID                     Tag  Name
0  3763058     item1, item2, item3  Andi
1  3763077  item_4, item_5, item_6  Mark

If important position of Name column use insert:

df2 = df2.groupby('ID')['Tag'].apply(', '.join).reset_index()
df2.insert(1, 'Name', df2['ID'].map(df1.set_index('ID')['Name']))
print (df2)
        ID  Name                     Tag
0  3763058  Andi     item1, item2, item3
1  3763077  Mark  item_4, item_5, item_6
Share:
15,957

Related videos on Youtube

Andi Maier
Author by

Andi Maier

Updated on September 14, 2022

Comments

  • Andi Maier
    Andi Maier over 1 year

    I am new to Python’s Pandas. I want to combine several Excel sheets by a common ID. Besides, there it is a one-to-many relationship.

    Here are the inputs:

    df1:

    ID Name
    3763058 Andi
    3763077 Mark

    df2:

    ID Tag
    3763058 item1
    3763058 item2
    3763058 item3
    3763077 item4
    3763077 item5
    3763077 item6

    I would now like to merge the two pandas data frames df1 and df2 into the following output (the column tag is merged in a single column per ID):

    ID Name Tag
    3763058 Andi item1, item2, item3
    3763077 Mark item4, item5, item6

    Could anybody please help me with this?

    Cheers, Andi