pandas concat ignore_index doesn't work
Solution 1
If I understood you correctly, this is what you would like to do.
import pandas as pd
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
'B': ['B0', 'B1', 'B2', 'B3'],
'D': ['D0', 'D1', 'D2', 'D3']},
index=[0, 2, 3,4])
df2 = pd.DataFrame({'A1': ['A4', 'A5', 'A6', 'A7'],
'C': ['C4', 'C5', 'C6', 'C7'],
'D2': ['D4', 'D5', 'D6', 'D7']},
index=[ 4, 5, 6 ,7])
df1.reset_index(drop=True, inplace=True)
df2.reset_index(drop=True, inplace=True)
df = pd.concat( [df1, df2], axis=1)
Which gives:
A B D A1 C D2
0 A0 B0 D0 A4 C4 D4
1 A1 B1 D1 A5 C5 D5
2 A2 B2 D2 A6 C6 D6
3 A3 B3 D3 A7 C7 D7
Actually, I would have expected that df = pd.concat(dfs,axis=1,ignore_index=True)
gives the same result.
This is the excellent explanation from jreback:
ignore_index=True 'ignores', meaning it doesn't align on the joining axis. It simply pastes them together in the order that they are passed, then reassigns a range for the actual index (e.g. range(len(index))). So the difference between joining on non-overlapping indexes (assume axis=1 in the example) is that with ignore_index=False (the default) you get the concat of the indexes, and with ignore_index=True you get a range.
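A minimal sketch of that distinction, reusing the two frames from the question (the names kept and ignored are only illustrative). With axis=1 the row labels are still aligned in both calls; ignore_index=True only swaps the column labels for a range.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 2, 3, 4])
df2 = pd.DataFrame({'A1': ['A4', 'A5', 'A6', 'A7'],
                    'C': ['C4', 'C5', 'C6', 'C7'],
                    'D2': ['D4', 'D5', 'D6', 'D7']},
                   index=[5, 6, 7, 3])

# Default: column names are kept, rows are outer-joined on their labels.
kept = pd.concat([df1, df2], axis=1)

# ignore_index=True: column names become 0..5, rows are STILL outer-joined.
ignored = pd.concat([df1, df2], axis=1, ignore_index=True)

print(list(kept.columns))
print(list(ignored.columns))
print(ignored.shape)  # 7 rows, because only label 3 is shared by both frames
```

Both results have seven rows, which is why the NaN padding in the question does not go away when ignore_index is toggled.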
Solution 2
The ignore_index option is working in your example, you just need to know that it is ignoring the axis of concatenation which in your case is the columns. (Perhaps a better name would be ignore_labels.) If you want the concatenation to ignore the index labels, then your axis variable has to be set to 0 (the default).
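For comparison, here is a small sketch of the axis=0 case, using two tiny frames made up for illustration (they are not the frames from the question). Here ignore_index=True does discard the row labels, because the rows are now the axis of concatenation.

```python
import pandas as pd

# Hypothetical frames with deliberately overlapping, non-contiguous row labels.
df1 = pd.DataFrame({'A': ['A0', 'A1']}, index=[0, 2])
df2 = pd.DataFrame({'A': ['A2', 'A3']}, index=[2, 7])

# Rows are stacked in the order given and relabelled 0..3.
stacked = pd.concat([df1, df2], axis=0, ignore_index=True)
print(stacked)
```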
Solution 3
In case you want to retain the index of the left data frame, set the index of df2 to be that of df1 using set_index:
pd.concat([df1, df2.set_index(df1.index)], axis=1)
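A short sketch of what this one-liner does, on cut-down versions of the question's frames (one column each, chosen for brevity). It assumes both frames have the same number of rows; otherwise set_index raises a length-mismatch error.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3']}, index=[0, 2, 3, 4])
df2 = pd.DataFrame({'C': ['C4', 'C5', 'C6', 'C7']}, index=[5, 6, 7, 3])

# set_index returns a new frame; df2 itself keeps its original labels.
result = pd.concat([df1, df2.set_index(df1.index)], axis=1)
print(result)
```

The rows are paired by position, and the result carries df1's labels 0, 2, 3, 4.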
Solution 4
Agree with the comments, always best to post expected output.
Is this what you are seeking?
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
'B': ['B0', 'B1', 'B2', 'B3'],
'D': ['D0', 'D1', 'D2', 'D3']},
index=[0, 2, 3,4])
df2 = pd.DataFrame({'A1': ['A4', 'A5', 'A6', 'A7'],
'C': ['C4', 'C5', 'C6', 'C7'],
'D2': ['D4', 'D5', 'D6', 'D7']},
index=[ 5, 6, 7,3])
df1 = df1.transpose().reset_index(drop=True).transpose()
df2 = df2.transpose().reset_index(drop=True).transpose()
dfs = [df1,df2]
df = pd.concat( dfs,axis=0,ignore_index=True)
print(df)
0 1 2
0 A0 B0 D0
1 A1 B1 D1
2 A2 B2 D2
3 A3 B3 D3
4 A4 C4 D4
5 A5 C5 D5
6 A6 C6 D6
7 A7 C7 D7
Solution 5
You can use numpy's concatenate to achieve the result.
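import numpy as np
import pandas as pd
# df1 and df2 are the frames from the question
cols = df1.columns.to_list() + df2.columns.to_list()
dfs = [df1, df2]
# np.concatenate works on the underlying arrays, so row labels play no part
df = np.concatenate(dfs, axis=1)
df = pd.DataFrame(df, columns=cols)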
Out[1]:
A B D A1 C D2
0 A0 B0 D0 A4 C4 D4
1 A1 B1 D1 A5 C5 D5
2 A2 B2 D2 A6 C6 D6
3 A3 B3 D3 A7 C7 D7
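One caveat worth checking before relying on the NumPy route: when the frames hold different dtypes, the combined array is an object array, so numeric columns may no longer be numeric afterwards. The sketch below uses two made-up frames (left and right are illustrative names, not from the question) to compare both approaches.

```python
import numpy as np
import pandas as pd

# Hypothetical frames: one integer column, one string column.
left = pd.DataFrame({'n': [1, 2, 3]})
right = pd.DataFrame({'s': ['a', 'b', 'c']})

cols = left.columns.to_list() + right.columns.to_list()

# NumPy route: everything passes through a single 2-D array.
via_numpy = pd.DataFrame(np.concatenate([left, right], axis=1), columns=cols)

# pandas route: each column keeps its own dtype.
via_pandas = pd.concat([left.reset_index(drop=True),
                        right.reset_index(drop=True)], axis=1)

print(via_numpy.dtypes)
print(via_pandas.dtypes)
```

With all-string frames like those in the question this makes no practical difference.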
Comments
-
muon over 2 years
I am trying to column-bind dataframes and am having an issue with pandas concat, as ignore_index=True doesn't seem to work:
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 2, 3, 4])
df2 = pd.DataFrame({'A1': ['A4', 'A5', 'A6', 'A7'],
                    'C': ['C4', 'C5', 'C6', 'C7'],
                    'D2': ['D4', 'D5', 'D6', 'D7']},
                   index=[5, 6, 7, 3])
df1
#     A   B   D
# 0  A0  B0  D0
# 2  A1  B1  D1
# 3  A2  B2  D2
# 4  A3  B3  D3
df2
#    A1   C  D2
# 5  A4  C4  D4
# 6  A5  C5  D5
# 7  A6  C6  D6
# 3  A7  C7  D7
dfs = [df1, df2]
df = pd.concat(dfs, axis=1, ignore_index=True)
print(df)
and the result is
     0    1    2    3    4    5
0   A0   B0   D0  NaN  NaN  NaN
2   A1   B1   D1  NaN  NaN  NaN
3   A2   B2   D2   A7   C7   D7
4   A3   B3   D3  NaN  NaN  NaN
5  NaN  NaN  NaN   A4   C4   D4
6  NaN  NaN  NaN   A5   C5   D5
7  NaN  NaN  NaN   A6   C6   D6
Even if I reset index using
df1.reset_index()
df2.reset_index()
and then try
pd.concat([df1,df2],axis=1)
it still produces the same result!
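A likely reason the reset above has no effect (an inference from the snippet as posted): reset_index() returns a new DataFrame and leaves the original untouched unless the result is assigned or inplace=True is passed. The sketch below uses a tiny made-up frame to show the difference.

```python
import pandas as pd

# Hypothetical two-row frame with non-contiguous labels.
df1 = pd.DataFrame({'A': ['A0', 'A1']}, index=[0, 2])

# The return value is discarded, so df1 keeps its labels.
df1.reset_index()
labels_after_bare_call = list(df1.index)   # [0, 2]

# Assign the result; drop=True avoids adding the old labels as a column.
df1 = df1.reset_index(drop=True)
labels_after_assignment = list(df1.index)  # [0, 1]

print(labels_after_bare_call, labels_after_assignment)
```

Without drop=True the old labels are kept as an extra column named index, which would then show up in the concatenated result.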
-
Alex Riley over 8 years
Does pd.concat([df1, df2], axis=0, ignore_index=True) produce what you want? If not, can you specify your expected output?
-
muon over 8 years
No, it binds the rows. I want to bind the columns (append). I tried append; that doesn't seem to work either.
-
cel over 8 years
@ajcr, have you compared the output of pd.concat([df1, df2], axis=1, ignore_index=True) and pd.concat([df1, df2], axis=1)? Shouldn't the first intuitively emulate a cbind?
-
Alex Riley over 8 years
I think ignore_index only ignores the labels on the axis you're joining on, so it still does an outer join on the index labels. I agree the names of function arguments aren't the most intuitive here.
-
muon over 8 years
Yes, I realized that from @Alex's answer ... but I have the same results even with ignore_index=False
-
muon over 8 years
Oh, that works ... Thanks! Funny thing is I was using the same method to bind dataframes inside a function and that was working fine, but the one outside the function wasn't.
-
muon over 8 years
Thanks! That was helpful (can't upvote yet, low rep)
-
cel over 8 years
@mau, I have updated my answer and now use reset_index(). I think this is a cleaner way.
-
muon over 8 years
I happened to try that out myself; could have saved myself a few hours if I had seen this earlier :). Thanks...
df = pd.concat([df1.reset_index(drop=True), df2.reset_index(drop=True)], axis=1)
-
Hugo Santos Silva almost 4 years
Indeed, this is a useful explanation that is missing in the docs.