Adding values to all rows of dataframe
Solution 1
When using mean on df1, it calculates over each column by default and produces a pd.Series.
When adding a pd.Series to a pd.DataFrame, pandas aligns the index of the pd.Series with the columns of the pd.DataFrame and broadcasts along the index of the pd.DataFrame ... by default.
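To make the alignment rule concrete, here is a minimal sketch on toy data. The frame, the labels and the values are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical toy data, only to show the alignment rule.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [10.0, 20.0]})

# The Series labels are matched against the DataFrame's column names;
# their order inside the Series does not matter.
s = pd.Series({"b": 100.0, "a": 0.5})
result = df + s  # every row of "a" gets +0.5, every row of "b" gets +100.0

# A label with no matching column (and a column with no matching label)
# ends up as NaN, which is why a non-numeric column such as Date needs care.
partial = df + pd.Series({"a": 0.5, "z": 1.0})

print(result)
print(partial)
```

The second addition shows the failure mode to watch for: anything that does not line up by label becomes NaN rather than raising an error.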
The only tricky bit is handling the Date column.
Option 1
m = df1.mean()
df2.loc[:, m.index] += m
df2
Date c1 c2 c3 c4 c5 c6 c10
0 2017-09-12 1.5 1.0 2.65 1.45 2.5 3.0 3.3
1 2017-09-13 0.8 2.7 2.45 1.95 3.7 1.9 2.5
2 2017-10-10 2.1 1.8 2.75 1.45 2.6 2.9 3.1
3 2017-10-11 3.3 2.0 3.15 0.95 1.8 1.9 2.7
If I know that 'Date' is always in the first column, I can:
df2.iloc[:, 1:] += df1.mean()
df2
Date c1 c2 c3 c4 c5 c6 c10
0 2017-09-12 1.5 1.0 2.65 1.45 2.5 3.0 3.3
1 2017-09-13 0.8 2.7 2.45 1.95 3.7 1.9 2.5
2 2017-10-10 2.1 1.8 2.75 1.45 2.6 2.9 3.1
3 2017-10-11 3.3 2.0 3.15 0.95 1.8 1.9 2.7
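For anyone who wants to run Option 1 end to end, here is a self-contained sketch using a trimmed-down copy of the question's data (two columns, two rows). One assumption of mine: I pass numeric_only=True to mean, because depending on the pandas version a string Date column may raise an error instead of being skipped silently.

```python
import pandas as pd

# Trimmed-down copies of the question's frames, with Date as a regular column.
df1 = pd.DataFrame({"Date": ["2017-09-10", "2017-09-11"],
                    "c1": [0.5, 0.7], "c2": [0.6, 1.2]})
df2 = pd.DataFrame({"Date": ["2017-09-12", "2017-09-13"],
                    "c1": [0.9, 0.2], "c2": [0.1, 1.8]})

# numeric_only=True skips the Date column explicitly (my addition).
m = df1.mean(numeric_only=True)

# Only the columns named in m.index are touched; Date is left alone.
df2.loc[:, m.index] += m

print(df2)
```

Note that this modifies df2 in place.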
Option 2
Notice that I use the append=True parameter in set_index, just in case there are things in the index you don't want to mess up.
df2.set_index('Date', append=True).add(df1.mean()).reset_index('Date')
Date c1 c2 c3 c4 c5 c6 c10
0 2017-09-12 1.5 1.0 2.65 1.45 2.5 3.0 3.3
1 2017-09-13 0.8 2.7 2.45 1.95 3.7 1.9 2.5
2 2017-10-10 2.1 1.8 2.75 1.45 2.6 2.9 3.1
3 2017-10-11 3.3 2.0 3.15 0.95 1.8 1.9 2.7
If you don't care about the index, you can shorten this to
df2.set_index('Date').add(df1.mean()).reset_index()
Date c1 c2 c3 c4 c5 c6 c10
0 2017-09-12 1.5 1.0 2.65 1.45 2.5 3.0 3.3
1 2017-09-13 0.8 2.7 2.45 1.95 3.7 1.9 2.5
2 2017-10-10 2.1 1.8 2.75 1.45 2.6 2.9 3.1
3 2017-10-11 3.3 2.0 3.15 0.95 1.8 1.9 2.7
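Here is a small runnable sketch of the append=True variant, showing that an existing index survives the round trip. The index labels "x" and "y" are hypothetical, the data is a trimmed-down copy of the question's, and numeric_only=True is my own addition so the Date column in df1 is skipped explicitly.

```python
import pandas as pd

df1 = pd.DataFrame({"Date": ["2017-09-10", "2017-09-11"],
                    "c1": [0.5, 0.7], "c2": [0.6, 1.2]})
# Hypothetical index labels, to show that they are preserved.
df2 = pd.DataFrame({"Date": ["2017-09-12", "2017-09-13"],
                    "c1": [0.9, 0.2], "c2": [0.1, 1.8]},
                   index=["x", "y"])

out = (
    df2.set_index("Date", append=True)    # Date joins the existing index
       .add(df1.mean(numeric_only=True))  # only numeric columns are left to align
       .reset_index("Date")               # Date goes back to being a column
)

print(out)
```

Unlike the in-place += form, this chain returns a new frame and leaves df2 untouched.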
Solution 2
If all columns are in both data frames, then just
for col in df2.columns:
    # add the df1 column mean to every row of the same column in df2
    df2[col] = df2[col] + df1[col].mean()
If the columns are not necessarily in both, then:
for col in df2.columns:
    # skip columns that df1 does not have
    if col in df1.columns:
        df2[col] = df2[col] + df1[col].mean()
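The same idea can also be written without an explicit loop. This is only a sketch of an alternative, not part of the original answer: it assumes the dates are in the index (so every column is numeric), and the extra column "c9" is made up to show that columns missing from df1 are left unchanged.

```python
import pandas as pd

# Hypothetical frames indexed by date; "c9" exists only in df2.
df1 = pd.DataFrame({"c1": [0.5, 0.7], "c2": [0.6, 1.2]},
                   index=["2017-09-10", "2017-09-11"])
df2 = pd.DataFrame({"c1": [0.9, 0.2], "c2": [0.1, 1.8], "c9": [5.0, 6.0]},
                   index=["2017-09-12", "2017-09-13"])

# Columns present in both frames.
common = df2.columns.intersection(df1.columns)

# Add the df1 column means to the shared columns only.
df2[common] = df2[common] + df1[common].mean()

print(df2)
```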
Solution 3
There is probably a more efficient way but here is a quick and dirty solution. I hope this helps!
import pandas as pd
d = {'c1': [0.5,0.7], 'c2': [0.6,1.2],'c3': [1.2,1.3]}
df1 = pd.DataFrame(data=d, index=['2017-09-10','2017-09-11'])
df2 = pd.DataFrame(data=d, index=['2017-09-12','2017-09-13'])
df1
Date c1 c2 c3
2017-09-10 0.5 0.6 1.2
2017-09-11 0.7 1.2 1.3
df2
Date c1 c2 c3
2017-09-12 0.5 0.6 1.2
2017-09-13 0.7 1.2 1.3
The averages of each column in df1 can be obtained using the describe() function
df1.describe().loc['mean']
c1 0.60
c2 0.90
c3 1.25
And now, simply add the series to df2
df2 + df1.describe().loc['mean']
Date c1 c2 c3
2017-09-12 1.1 1.5 2.45
2017-09-13 1.3 2.1 2.55
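As for the more efficient way: calling mean directly should give the same values as the mean row of describe, without computing the other summary statistics. A minimal sketch on the same toy data, using label-based .loc for the row lookup:

```python
import pandas as pd

d = {"c1": [0.5, 0.7], "c2": [0.6, 1.2], "c3": [1.2, 1.3]}
df1 = pd.DataFrame(data=d, index=["2017-09-10", "2017-09-11"])
df2 = pd.DataFrame(data=d, index=["2017-09-12", "2017-09-13"])

# Column means computed directly.
means = df1.mean()

# The same numbers, taken from the "mean" row of the describe() summary.
via_describe = df1.describe().loc["mean"]

# Add the column means to every row of df2.
result = df2 + means

print(result)
```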
Jagruth
Updated on June 04, 2022
Comments
-
Jagruth almost 2 years
I have two pandas dataframes df1 (of length 2) and df2 (of length about 30 rows). Index values of df1 are always different and never occur in df2. I would like to add the average of columns from df1 to corresponding columns of df2. Example: add 0.6 to all rows of c1 and 0.9 to all rows of c2 etc ...
df1:
Date c1 c2 c3 c4 c5 c6 ... c10
2017-09-10 0.5 0.6 1.2 0.7 1.3 1.8 ... 1.3
2017-09-11 0.7 1.2 1.3 0.4 0.7 0.4 ... 1.5
df2:
Date c1 c2 c3 c4 c5 c6 ... c10
2017-09-12 0.9 0.1 1.4 0.9 1.5 1.9 ... 1.9
2017-09-13 0.2 1.8 1.2 1.4 2.7 0.8 ... 1.1
: : : :
2017-10-10 1.5 0.9 1.5 0.9 1.6 1.8 ... 1.7
2017-10-11 2.7 1.1 1.9 0.4 0.8 0.8 ... 1.3
How can I do that?
-
jezrael over 6 years
What is index value of appended row?
-