self-join with Pandas
18,130
Solution 1
join
with on='i'
d.join(d.drop('i', 1), on='i', rsuffix='_y')
some_col i some_col_y
0 A 2 C
1 B 1 B
2 C 1 B
Solution 2
Instead of using merge
you can also use indexing and assignment:
>>> d['new_col'] = d['some_col'][d['i']].values
>>> d
some_col i new_col
0 A 2 C
1 B 1 B
2 C 1 B
Solution 3
Try this:
In [69]: d.join(d.set_index('i'), rsuffix='_y')
Out[69]:
some_col i some_col_y
0 A 2 NaN
1 B 1 B
1 B 1 C
2 C 1 A
or:
In [64]: pd.merge(d[['some_col']], d, left_index=True, right_on='i', suffixes=['_y','']).sort_index()
Out[64]:
some_col_y some_col i
0 C A 2
1 B B 1
2 B C 1
Author by
Nucular
Updated on June 14, 2022Comments
-
Nucular almost 2 years
I would like to perform a self-join on a Pandas dataframe so that some rows get appended to the original rows. Each row has a marker 'i' indicating which row should get appended to it on the right.
d = pd.DataFrame(['A','B','C'], columns = ['some_col']) d['i'] = [2,1,1] In [17]: d Out[17]: some_col i 0 A 2 1 B 1 2 C 1
Desired output:
some_col i some_col_y 0 A 2 C 1 B 1 B 2 C 1 B
That is, row 2 gets appended to row 0, row 1 to row 1, row 1 to row 2 (as indicated by i).
My idea of how to go about it was
pd.merge(d, d, left_index = True, right_on = 'i', how = 'left')
But it produces something else altogether. How to do it correctly?
-
MaxU - stop genocide of UA over 7 yearsI like your answer more than mine. If OP needs a virtual column - it can be done this way:
d.assign(some_col_y=d['some_col'].loc[d['i']].values)