Join dataframes - one with multiindex columns and the other without

12,533

Solution 1

It depends on what you want! Do you want the column from df2 to be aligned with the 1st or second level of columns from df?

You have to add a level to the columns of df2

Super cheezy with pd.concat

df.join(pd.concat([df2], axis=1, keys=['a']))

Better way

df2.columns = pd.MultiIndex.from_product([['a'], df2.columns])

df.join(df2)

enter image description here

Solution 2

I think simpliest is create MultiIndex in df2 and then use concat or join:

df2.columns = pd.MultiIndex.from_tuples([('a','w')])
print (df2)
          a
          w
A -0.562729
B -0.212032
C  0.102451
df2.columns = [['a'], df2.columns]
print (df2)
          a
          w
A -1.253881
B -0.637752
C  0.907105

df3 = pd.concat([df, df2], axis=1)

Or:

df3 = df.join(df2)

print (df3)
first        bar                 baz                 foo                 qux  \
second       one       two       one       two       one       two       one   
A      -0.269667  0.221566  1.138393  0.871762 -0.063132 -1.995682 -0.797885   
B      -0.456878  0.293350 -1.040748 -1.307871  0.002462  1.580711 -0.198943   
C      -0.691755 -0.279445 -0.809215 -0.006658  1.452484  0.516414 -0.295961   

first                    a  
second       two         w  
A       1.068843 -0.562729  
B       1.247057 -0.212032  
C      -0.345300  0.102451  
Share:
12,533

Related videos on Youtube

Eyal S.
Author by

Eyal S.

Updated on June 04, 2022

Comments

  • Eyal S.
    Eyal S. almost 2 years

    I'm trying to join two dataframes - one with multiindex columns and the other with a single column name. They have similar index.

    I get the following warning: "UserWarning: merging between different levels can give an unintended result (3 levels on the left, 1 on the right)"

    For example:

    arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
              ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
    tuples = list(zip(*arrays))
    index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
    df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
    df2 = pd.DataFrame(np.random.randn(3), index=['A', 'B', 'C'],columns=['w'])
    df3 = df.join(df2)
    

    What is the best way to join these two dataframes?

  • piRSquared
    piRSquared about 7 years
    @jezrael there you go :-)
  • BallpointBen
    BallpointBen about 6 years
    How did you create that table graphic?