Group By a Column and Sum contents of another column with Python

13,080

You can add all columns to [] for aggregating:

print (df.groupby(by=['class_energy'])['ACT_TIME_AERATEUR_1_F1', 'ACT_TIME_AERATEUR_1_F3','ACT_TIME_AERATEUR_1_F5'].sum())
              ACT_TIME_AERATEUR_1_F1  ACT_TIME_AERATEUR_1_F3  \
class_energy                                                   
high                       45.670000                0.000000   
low                        63.333333               63.333333   
medium                      0.000000               20.000000   

              ACT_TIME_AERATEUR_1_F5  
class_energy                          
high                       55.940000  
low                        87.323333  
medium                     23.990000  

You can use also parameter as_index=False:

print (df.groupby(by=['class_energy'], as_index=False)['ACT_TIME_AERATEUR_1_F1', 'ACT_TIME_AERATEUR_1_F3','ACT_TIME_AERATEUR_1_F5'].sum())
  class_energy  ACT_TIME_AERATEUR_1_F1  ACT_TIME_AERATEUR_1_F3  \
0         high               45.670000                0.000000   
1          low               63.333333               63.333333   
2       medium                0.000000               20.000000   

   ACT_TIME_AERATEUR_1_F5  
0               55.940000  
1               87.323333  
2               23.990000  

If need aggregate only first 3 columns:

print (df.groupby(by=['class_energy'], as_index=False)[df.columns[:3]].sum())
  class_energy  ACT_TIME_AERATEUR_1_F1  ACT_TIME_AERATEUR_1_F3  \
0         high               45.670000                0.000000   
1          low               63.333333               63.333333   
2       medium                0.000000               20.000000   

   ACT_TIME_AERATEUR_1_F5  
0               55.940000  
1               87.323333  
2               23.990000  

...or all columns without last:

print (df.groupby(by=['class_energy'], as_index=False)[df.columns[:-1]].sum())
  class_energy  ACT_TIME_AERATEUR_1_F1  ACT_TIME_AERATEUR_1_F3  \
0         high               45.670000                0.000000   
1          low               63.333333               63.333333   
2       medium                0.000000               20.000000   

   ACT_TIME_AERATEUR_1_F5  
0               55.940000  
1               87.323333  
2               23.990000  
Share:
13,080

Related videos on Youtube

Poisson
Author by

Poisson

Updated on September 15, 2022

Comments

  • Poisson
    Poisson over 1 year

    I have a dataframe merged_df_energy:

    +------------------------+------------------------+------------------------+--------------+
    | ACT_TIME_AERATEUR_1_F1 | ACT_TIME_AERATEUR_1_F3 | ACT_TIME_AERATEUR_1_F5 | class_energy |
    +------------------------+------------------------+------------------------+--------------+
    | 63.333333              | 63.333333              | 63.333333              | low          |
    | 0                      | 0                      | 0                      | high         |
    | 45.67                  | 0                      | 55.94                  | high         |
    | 0                      | 0                      | 23.99                  | low          |
    | 0                      | 20                     | 23.99                  | medium       |
    +------------------------+------------------------+------------------------+--------------+
    

    I would like to create for each ACT_TIME_AERATEUR_1_Fx (ACT_TIME_AERATEUR_1_F1, ACT_TIME_AERATEUR_1_F3 and ACT_TIME_AERATEUR_1_F5) a dataframe which contains these columns: class_energy and sum_time

    For example for the dataframe corresponding to ACT_TIME_AERATEUR_1_F1:

    +-----------------+-----------+
    |  class_energy   | sum_time  |
    +-----------------+-----------+
    | low             | 63.333333 |
    | medium          | 0         |
    | high            | 45.67     |
    +-----------------+-----------+
    

    I thing to do I should use the group by like this:

    data.groupby(by=['class_energy'])['sum_time'].sum()
    

    How can I do this?

  • jezrael
    jezrael over 7 years
    Thank you for upvoting. Can I edit your question to be more readible?