Pandas sum multiple dataframes
Use the add method with the fill_value=0 parameter.
import numpy as np
import pandas as pd
df1 = pd.DataFrame({'val':{'a': 1, 'b':2, 'c':3}})
df2 = pd.DataFrame({'val':{'a': 1, 'b':2, 'd':3}})
df1.add(df2, fill_value=0)
   val
a  2.0
b  4.0
c  3.0
d  3.0
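To see why fill_value matters, here is a minimal sketch that contrasts the plain + operator with add on the same two frames. The variable names plain and filled are only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'val': {'a': 1, 'b': 2, 'c': 3}})
df2 = pd.DataFrame({'val': {'a': 1, 'b': 2, 'd': 3}})

# Plain addition aligns on the index; a label missing from either frame yields NaN.
plain = df1 + df2

# add with fill_value=0 treats a label missing from one frame as 0 before adding.
filled = df1.add(df2, fill_value=0)

print(plain)
print(filled)
```

Both results carry the union of the two indexes; only the handling of the unmatched labels c and d differs.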
MultiIndex example
idx1 = pd.MultiIndex.from_tuples([('a', 'A'), ('a', 'B'), ('b', 'A'), ('b', 'D')])
idx2 = pd.MultiIndex.from_tuples([('a', 'A'), ('a', 'C'), ('b', 'A'), ('b', 'C')])
np.random.seed([3,1415])
df1 = pd.DataFrame(np.random.randn(4, 1), idx1, ['val'])
df2 = pd.DataFrame(np.random.randn(4, 1), idx2, ['val'])
df1
          val
a A -2.129724
  B -1.268466
b A -1.970500
  D -2.259055
df2
          val
a A -0.349286
  C -0.026955
b A  0.316236
  C  0.348782
df1.add(df2, fill_value=0)
          val
a A -2.479011
  B -1.268466
  C -0.026955
b A -1.654264
  C  0.348782
  D -2.259055
More than 2 dataframes
from functools import reduce
df1 = pd.DataFrame({'val':{'a': 1, 'b':2, 'c':3}})
df2 = pd.DataFrame({'val':{'a': 1, 'b':2, 'd':3}})
df3 = pd.DataFrame({'val':{'e': 1, 'c':2, 'd':3}})
df4 = pd.DataFrame({'val':{'f': 1, 'a':2, 'd':3}})
df5 = pd.DataFrame({'val':{'g': 1, 'f':2, 'd':3}})
reduce(lambda a, b: a.add(b, fill_value=0), [df1, df2, df3, df4, df5])
    val
a   4.0
b   4.0
c   5.0
d  12.0
e   1.0
f   3.0
g   1.0
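An alternative that is not part of the answer above, offered only as a sketch: stack all the frames with pd.concat and sum the rows that share an index label with groupby. The list name frames and the result names are assumptions for illustration. For these five frames it should give the same totals as the reduce version, though as integers rather than floats, since no NaN is introduced along the way.

```python
from functools import reduce

import pandas as pd

frames = [
    pd.DataFrame({'val': {'a': 1, 'b': 2, 'c': 3}}),
    pd.DataFrame({'val': {'a': 1, 'b': 2, 'd': 3}}),
    pd.DataFrame({'val': {'e': 1, 'c': 2, 'd': 3}}),
    pd.DataFrame({'val': {'f': 1, 'a': 2, 'd': 3}}),
    pd.DataFrame({'val': {'g': 1, 'f': 2, 'd': 3}}),
]

# Approach from the answer: fold the list pairwise with add(fill_value=0).
via_reduce = reduce(lambda a, b: a.add(b, fill_value=0), frames)

# Alternative sketch: stack all rows, then sum rows that share an index label.
via_concat = pd.concat(frames).groupby(level=0).sum()

print(via_reduce)
print(via_concat)
```

With a MultiIndex, the groupby would need to group on every index level (for example level=[0, 1]) rather than on level 0 alone.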
Comments
-
hangc almost 2 years
I have multiple dataframes, each with a multi-level index and a value column. I want to add up all the dataframes on the value columns.
df1 + df2
Not all the indexes are complete in each dataframe, so I get NaN on any row that is not present in all the dataframes. How can I overcome this and treat rows that are missing from a dataframe as having a value of 0?
E.g., I want to get
   val
a    2
b    4
c    3
d    3
from
pd.DataFrame({'val':{'a': 1, 'b':2, 'c':3}}) + pd.DataFrame({'val':{'a': 1, 'b':2, 'd':3}})
instead of
   val
a    2
b    4
c  NaN
d  NaN
-
MaxU - stop genocide of UA almost 8 years
Very neat answer! Will it also work with MultiIndex DFs?
-
Zed Fang about 7 years
If I have 3 dataframes, how can I use add in a simple way?
-
piRSquared about 7 years
The answer I'd give would be different than this. I suggest you ask another question. That way everyone gets the benefit of seeing it.
-
piRSquared about 7 years
@ZedFang I'll be waiting with an answer for when you ask this follow-up question.
-
schnaidar about 3 years
@piRSquared I think it would have helped everyone the most if you had just written your answer here, instead of adding two comments about how you would answer if there were another question. So, did you answer somewhere?
-
piRSquared about 3 years
@schnaidar Fair enough. I updated my answer.
-
questionto42standswithUkraine almost 3 years
Note that this only works when the column names are the same in both dataframes. Otherwise, it will concatenate the two dfs instead of summing them.
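The point in the last comment can be checked with a small sketch. The column names x and y are made up for illustration. When the two frames share no column, add aligns on both axes, so the result holds the union of the columns rather than a sum, and a cell that is missing from both inputs should stay NaN even with fill_value=0.

```python
import pandas as pd

# Hypothetical frames with different column names ('x' vs 'y').
left = pd.DataFrame({'x': {'a': 1, 'b': 2, 'c': 3}})
right = pd.DataFrame({'y': {'a': 1, 'b': 2, 'd': 3}})

# Columns are aligned as well as rows, so nothing is actually summed here.
result = left.add(right, fill_value=0)

print(result)
```

If the frames are meant to be summed despite the differing names, renaming the columns to match before calling add is one way to get the totals.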