Pandas sum multiple dataframes

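Use the add method with the fill_value=0 parameter. A row that is missing from one of the two dataframes is then treated as 0 instead of producing NaN.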

import pandas as pd

df1 = pd.DataFrame({'val':{'a': 1, 'b':2, 'c':3}})
df2 = pd.DataFrame({'val':{'a': 1, 'b':2, 'd':3}})

df1.add(df2, fill_value=0)

   val
a  2.0
b  4.0
c  3.0
d  3.0
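
For comparison, here is a minimal sketch of how the plain + operator differs from add with fill_value=0, using the same two frames as above. The names plain and filled are only illustrative and do not come from the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'val': {'a': 1, 'b': 2, 'c': 3}})
df2 = pd.DataFrame({'val': {'a': 1, 'b': 2, 'd': 3}})

# Plain addition aligns on the index; a label missing from either side gives NaN.
plain = df1 + df2

# fill_value=0 substitutes 0 when a label is missing from exactly one side.
filled = df1.add(df2, fill_value=0)

print(plain)
print(filled)
```

Note that fill_value only helps when a value is missing on one side. If it is missing on both sides, the result is still NaN.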

MultiIndex example

import numpy as np
import pandas as pd

idx1 = pd.MultiIndex.from_tuples([('a', 'A'), ('a', 'B'), ('b', 'A'), ('b', 'D')])
idx2 = pd.MultiIndex.from_tuples([('a', 'A'), ('a', 'C'), ('b', 'A'), ('b', 'C')])

# Seed the generator so the random values are reproducible
np.random.seed([3,1415])
df1 = pd.DataFrame(np.random.randn(4, 1), index=idx1, columns=['val'])
df2 = pd.DataFrame(np.random.randn(4, 1), index=idx2, columns=['val'])

df1

          val
a A -2.129724
  B -1.268466
b A -1.970500
  D -2.259055

df2

          val
a A -0.349286
  C -0.026955
b A  0.316236
  C  0.348782

df1.add(df2, fill_value=0)

          val
a A -2.479011
  B -1.268466
  C -0.026955
b A -1.654264
  C  0.348782
  D -2.259055

More than 2 dataframes

from functools import reduce

df1 = pd.DataFrame({'val':{'a': 1, 'b':2, 'c':3}})
df2 = pd.DataFrame({'val':{'a': 1, 'b':2, 'd':3}})
df3 = pd.DataFrame({'val':{'e': 1, 'c':2, 'd':3}})
df4 = pd.DataFrame({'val':{'f': 1, 'a':2, 'd':3}})
df5 = pd.DataFrame({'val':{'g': 1, 'f':2, 'd':3}})

# Fold the list pairwise: add df1 and df2, then add df3 to that result, and so on
reduce(lambda a, b: a.add(b, fill_value=0), [df1, df2, df3, df4, df5])

    val
a   4.0
b   4.0
c   5.0
d  12.0
e   1.0
f   3.0
g   1.0
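
As an alternative to reduce (not part of the original answer), the frames can be stacked with pd.concat and then summed per index label with groupby. This is only a sketch under the assumption that every frame has the same columns; the names frames and total are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'val': {'a': 1, 'b': 2, 'c': 3}})
df2 = pd.DataFrame({'val': {'a': 1, 'b': 2, 'd': 3}})
df3 = pd.DataFrame({'val': {'e': 1, 'c': 2, 'd': 3}})
df4 = pd.DataFrame({'val': {'f': 1, 'a': 2, 'd': 3}})
df5 = pd.DataFrame({'val': {'g': 1, 'f': 2, 'd': 3}})

frames = [df1, df2, df3, df4, df5]

# Stack all frames vertically, then sum the rows that share an index label.
total = pd.concat(frames).groupby(level=0).sum()

print(total)
```

Because no NaN is ever introduced, the values stay integers here. For a MultiIndex, the grouping would need to cover every index level, for example groupby(level=[0, 1]) for two levels.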

Author: hangc

Software Engineer. AWS Certified.

Updated on July 09, 2022

Comments

  • hangc, almost 2 years ago

    I have multiple dataframes each with a multi-level-index and a value column. I want to add up all the dataframes on the value columns.

    df1 + df2

    Not every index label is present in each dataframe, so I get NaN for any row that is missing from at least one of them.

    How can I overcome this and treat rows that are missing from a dataframe as having a value of 0?

    E.g., I want to get

       val
    a    2
    b    4
    c    3
    d    3
    

    from pd.DataFrame({'val':{'a': 1, 'b':2, 'c':3}}) + pd.DataFrame({'val':{'a': 1, 'b':2, 'd':3}}) instead of

       val
    a    2
    b    4
    c  NaN
    d  NaN
    
  • MaxU - stop genocide of UA, almost 8 years ago
    Very neat answer! Will it also work with MultiIndex DataFrames?
  • Zed Fang, about 7 years ago
    If I have 3 dataframes, how can I use add in a simple way?
  • piRSquared, about 7 years ago
    The answer I'd give would be different than this. I suggest you ask another question. That way everyone gets the benefit of seeing it.
  • piRSquared, about 7 years ago
    @ZedFang I'll be waiting with an answer for when you ask this follow up question.
  • schnaidar, about 3 years ago
    @piRSquared I think it would have helped everyone the most if you had just written your answer here, instead of adding two comments about how you would answer if only there were another question. So, did you answer somewhere?
  • piRSquared, about 3 years ago
    @schnaidar fair enough. I updated my answer.
  • questionto42standswithUkraine, almost 3 years ago
    Mind that this only works when the column names are the same in both dataframes. Otherwise, it will concatenate the two dfs instead of summing them.
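
The caveat in the last comment can be illustrated with a small sketch. The frames left and right and their column names x and y are hypothetical and chosen only to show what happens when the column names differ.

```python
import pandas as pd

left = pd.DataFrame({'x': {'a': 1, 'b': 2}})
right = pd.DataFrame({'y': {'a': 10, 'b': 20}})

# The columns do not overlap, so nothing is actually summed:
# each column is carried over, with 0 filled in for the side that lacks it.
result = left.add(right, fill_value=0)

print(result)
```

If the columns are meant to be summed together, one option is to rename them to a common name first, for example with right.rename(columns={'y': 'x'}).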