Merging pandas columns (one-to-many)
15,957
You can use first groupby
with join
:
df2 = df2.groupby('ID')['Tag'].apply(', '.join).reset_index()
print (df2)
ID Tag
0 3763058 item1, item2, item3
1 3763077 item_4, item_5, item_6
Then is possible use merge
, especially if df1
has more columns:
df = pd.merge(df1, df2, on='ID', how='left')
print (df)
ID Name Tag
0 3763058 Andi item1, item2, item3
1 3763077 Mark item_4, item_5, item_6
Solution with map
, if need add only one column:
df2 = df2.groupby('ID')['Tag'].apply(', '.join).reset_index()
df2['Name'] = df2['ID'].map(df1.set_index('ID')['Name'])
print (df2)
ID Tag Name
0 3763058 item1, item2, item3 Andi
1 3763077 item_4, item_5, item_6 Mark
If important position of Name
column use insert
:
df2 = df2.groupby('ID')['Tag'].apply(', '.join).reset_index()
df2.insert(1, 'Name', df2['ID'].map(df1.set_index('ID')['Name']))
print (df2)
ID Name Tag
0 3763058 Andi item1, item2, item3
1 3763077 Mark item_4, item_5, item_6
Related videos on Youtube
Author by
Andi Maier
Updated on September 14, 2022Comments
-
Andi Maier over 1 year
I am new to Python’s Pandas. I want to combine several Excel sheets by a common ID. Besides, there it is a one-to-many relationship.
Here are the inputs:
df1:
ID Name 3763058 Andi 3763077 Mark df2:
ID Tag 3763058 item1 3763058 item2 3763058 item3 3763077 item4 3763077 item5 3763077 item6 I would now like to merge the two pandas data frames df1 and df2 into the following output (the column tag is merged in a single column per ID):
ID Name Tag 3763058 Andi item1, item2, item3 3763077 Mark item4, item5, item6 Could anybody please help me with this?
Cheers, Andi