I applied sum() on a groupby and I want to sort the values of the last column

Solution 1

Suppose df is:

     user_ID  product_id  amount
0        1         456       1
1        1          87       1
2        1         788       3
3        1         456       5
4        1          87       2
5        2         456       1
6        2         788       3
7        2         456       5

Then you can use groupby and sum as before, and additionally sort the result by two columns, ['user_ID', 'amount']. Passing ascending=[True, False] sorts users in ascending order and, within each user, amounts in descending order:

new_df = df.groupby(['user_ID','product_id'], sort=True).sum().reset_index()
new_df = new_df.sort_values(by = ['user_ID', 'amount'], ascending=[True,False])
print(new_df)

Output:

     user_ID  product_id  amount
1        1         456       6
0        1          87       3
2        1         788       3
3        2         456       6
4        2         788       3
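
For a version that runs end to end, here is a minimal sketch that rebuilds the sample frame from the table above. Passing as_index=False instead of calling reset_index(), and adding product_id as a tie-breaker, are my own choices rather than part of the answer; the tie-breaker only makes the order of equal amounts deterministic.

```python
import pandas as pd

# Sample data copied from the table in this answer.
df = pd.DataFrame({
    "user_ID":    [1, 1, 1, 1, 1, 2, 2, 2],
    "product_id": [456, 87, 788, 456, 87, 456, 788, 456],
    "amount":     [1, 1, 3, 5, 2, 1, 3, 5],
})

# as_index=False keeps the group keys as ordinary columns,
# so no reset_index() call is needed afterwards.
totals = df.groupby(["user_ID", "product_id"], as_index=False)["amount"].sum()

# Users ascending, summed amount descending within each user;
# product_id ascending is only a tie-breaker for equal amounts.
ranked = totals.sort_values(
    by=["user_ID", "amount", "product_id"],
    ascending=[True, False, True],
)
print(ranked)
```

Because user_ID is the first sort key, rows of the same user stay together, so the amounts end up sorted within each user.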

Solution 2

This returns the n users with the largest total amount:

# n = number of rows you want to return
df.groupby(['user_ID'])['amount'].sum().nlargest(n)
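
Note that the one-liner above ranks users by their overall total. If what you want is the top n products within each user, one possible sketch is to sort as in Solution 1 and then take head(n) per group. The sample data and n = 2 are assumptions for illustration only.

```python
import pandas as pd

# Sample data matching the table in Solution 1.
df = pd.DataFrame({
    "user_ID":    [1, 1, 1, 1, 1, 2, 2, 2],
    "product_id": [456, 87, 788, 456, 87, 456, 788, 456],
    "amount":     [1, 1, 3, 5, 2, 1, 3, 5],
})

n = 2  # assumed value: how many rows to keep

# Top n users by total amount across all products.
top_users = df.groupby("user_ID")["amount"].sum().nlargest(n)

# Top n products per user: sum per (user, product), sort so the
# largest totals come first within each user, then keep the first
# n rows of every user group.
totals = df.groupby(["user_ID", "product_id"], as_index=False)["amount"].sum()
top_per_user = (
    totals.sort_values(
        by=["user_ID", "amount", "product_id"],
        ascending=[True, False, True],  # product_id only breaks ties
    )
    .groupby("user_ID")
    .head(n)
)

print(top_users)
print(top_per_user)
```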
Author: KawtarZZ

Updated on June 04, 2022

Comments

  • KawtarZZ
    KawtarZZ almost 2 years

    Given the following DataFrame

    user_ID  product_id  amount
       1       456          1
       1        87          1
       1       788          3
       1       456          5
       1        87          2
      ...      ...         ...
    

    The first column is the customer's ID, the second is the ID of the product they bought, and 'amount' is the quantity of that product purchased on a given day (the date is also taken into consideration). A customer can buy as many products as they want each day. I want to calculate the total number of times each product is bought by each customer, so I applied a groupby:

    df.groupby(['user_ID','product_id'], sort=True).sum()
    

    Now I want to sort the summed amounts within each group. Any help?

  • Malinda
    Malinda almost 3 years
    This sorts the whole data frame. It won't help if you want to sort the elements within each group.