I applied sum() on a groupby and I want to sort the values of the last column
16,103
Solution 1
Suppose df
is:
user_ID product_id amount
0 1 456 1
1 1 87 1
2 1 788 3
3 1 456 5
4 1 87 2
5 2 456 1
6 2 788 3
7 2 456 5
Then you can use, groupby
and sum
as before, in addition you can sort values by two columns [user_ID, amount]
and ascending=[True,False]
refers ascending order of user and for each user descending order of amount:
new_df = df.groupby(['user_ID','product_id'], sort=True).sum().reset_index()
new_df = new_df.sort_values(by = ['user_ID', 'amount'], ascending=[True,False])
print(new_df)
Output:
user_ID product_id amount
1 1 456 6
0 1 87 3
2 1 788 3
3 2 456 6
4 2 788 3
Solution 2
This would give you the top 5 largest:
# n = number of rows you want to return
df.groupby(['user_id'])['amount'].sum().nlargest(n)
Related videos on Youtube
Author by
KawtarZZ
Updated on June 04, 2022Comments
-
KawtarZZ almost 2 years
Given the following DataFrame
user_ID product_id amount 1 456 1 1 87 1 1 788 3 1 456 5 1 87 2 ... ... ...
The first column is the ID of the customer, the second is the ID of the product he bought and the 'amount' express if the quantity of the product purchased on that given day (the date is also taken into consideration). a customer can buy many products each day as much as he wants to. I want to calculate the total of times each product is bought by the customer, so I applied a
groupby
df.groupby(['user_id','product_id'], sort=True).sum()
now I want to sort the sum of amount in each group. Any help?
-
Malinda almost 3 yearsthis is to sort whole data frame. This won't help, if you want to sort elements in each group.