Group By a Column and Sum contents of another column with Python
13,080
You can add all columns to []
for aggregating:
print (df.groupby(by=['class_energy'])['ACT_TIME_AERATEUR_1_F1', 'ACT_TIME_AERATEUR_1_F3','ACT_TIME_AERATEUR_1_F5'].sum())
ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \
class_energy
high 45.670000 0.000000
low 63.333333 63.333333
medium 0.000000 20.000000
ACT_TIME_AERATEUR_1_F5
class_energy
high 55.940000
low 87.323333
medium 23.990000
You can use also parameter as_index=False
:
print (df.groupby(by=['class_energy'], as_index=False)['ACT_TIME_AERATEUR_1_F1', 'ACT_TIME_AERATEUR_1_F3','ACT_TIME_AERATEUR_1_F5'].sum())
class_energy ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \
0 high 45.670000 0.000000
1 low 63.333333 63.333333
2 medium 0.000000 20.000000
ACT_TIME_AERATEUR_1_F5
0 55.940000
1 87.323333
2 23.990000
If need aggregate only first 3
columns:
print (df.groupby(by=['class_energy'], as_index=False)[df.columns[:3]].sum())
class_energy ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \
0 high 45.670000 0.000000
1 low 63.333333 63.333333
2 medium 0.000000 20.000000
ACT_TIME_AERATEUR_1_F5
0 55.940000
1 87.323333
2 23.990000
...or all columns without last:
print (df.groupby(by=['class_energy'], as_index=False)[df.columns[:-1]].sum())
class_energy ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \
0 high 45.670000 0.000000
1 low 63.333333 63.333333
2 medium 0.000000 20.000000
ACT_TIME_AERATEUR_1_F5
0 55.940000
1 87.323333
2 23.990000
Related videos on Youtube
Author by
Poisson
Updated on September 15, 2022Comments
-
Poisson over 1 year
I have a dataframe
merged_df_energy
:+------------------------+------------------------+------------------------+--------------+ | ACT_TIME_AERATEUR_1_F1 | ACT_TIME_AERATEUR_1_F3 | ACT_TIME_AERATEUR_1_F5 | class_energy | +------------------------+------------------------+------------------------+--------------+ | 63.333333 | 63.333333 | 63.333333 | low | | 0 | 0 | 0 | high | | 45.67 | 0 | 55.94 | high | | 0 | 0 | 23.99 | low | | 0 | 20 | 23.99 | medium | +------------------------+------------------------+------------------------+--------------+
I would like to create for each
ACT_TIME_AERATEUR_1_Fx
(ACT_TIME_AERATEUR_1_F1
,ACT_TIME_AERATEUR_1_F3
andACT_TIME_AERATEUR_1_F5
) a dataframe which contains these columns:class_energy
andsum_time
For example for the dataframe corresponding to
ACT_TIME_AERATEUR_1_F1
:+-----------------+-----------+ | class_energy | sum_time | +-----------------+-----------+ | low | 63.333333 | | medium | 0 | | high | 45.67 | +-----------------+-----------+
I thing to do I should use the group by like this:
data.groupby(by=['class_energy'])['sum_time'].sum()
How can I do this?
-
jezrael over 7 yearsThank you for upvoting. Can I edit your question to be more readible?