Adding total row to pandas DataFrame groupby
11,135
Solution 1
You can try stacking it one level at a time:
m = df.groupby(['Level','Company','Item'])['Value'].sum().unstack(level=['Company','Item'])
m = m.assign(total=m.sum(1))
m = m.stack(level='Company')
m = m.assign(total=m.sum(1))
m = m.stack(level='Item')
The output has duplicate totals though:
Level Company Item
1 X a 100.0
b 200.0
total 300.0
Y a 35.0
b 150.0
c 35.0
total 220.0
total 520.0
total 520.0
2 X a 48.0
b 100.0
c 50.0
total 198.0
Y a 80.0
total 80.0
total 278.0
total 278.0
dtype: float64
Solution 2
You can use:
m=df.groupby(['Level','Company','Item'])['Value'].sum().unstack()
m.assign(total=m.sum(1)).stack().to_frame('Value')
Value
Level Company Item
1 X a 100.0
b 200.0
total 300.0
Y a 35.0
b 150.0
c 35.0
total 220.0
2 X a 48.0
b 100.0
c 50.0
total 198.0
Y a 80.0
total 80.0
Solution 3
Try this: Basically, this is creating two new dfs from the using the sum of the two groups and concating the three data frames
level = list(map(int, list('111112222')))
company = list('XXYYYXXXY')
item = list('ababcabca')
value = [100,200,35,150,35,48,100,50,80]
col = ['Level', 'Company', 'Item', 'Value']
df = pd.DataFrame([level,company,item,value]).T
df.columns = col
df1 = (df.groupby(['Level', 'Company', 'Item'])['Value'].sum())
df2 = (df1.sum(level=0).to_frame().assign(Company='total').set_index('Company', append=True))
df3 = (df1.groupby(['Level','Company']).sum().to_frame().assign(Item='total').set_index('Item', append=True))
dfx = pd.concat([df1.to_frame().reset_index(),
df2.reset_index(),
df3.reset_index()],sort=False)
print(dfx)
Output:
Level Company Item Value
0 1 X a 100
1 1 X b 200
2 1 Y a 35
3 1 Y b 150
4 1 Y c 35
5 2 X a 48
6 2 X b 100
7 2 X c 50
8 2 Y a 80
0 1 total NaN 520
1 2 total NaN 278
0 1 X total 300
1 1 Y total 220
2 2 X total 198
3 2 Y total 80
This is not sorted though as you expect. If I concat the 3 dfs without resetting the index I'm getting the expected sort order, but the index is a multi-index column
dfx = pd.concat([df1.to_frame(), df2, df3]).sort_index()
Output
Value
(1, X, a) 100
(1, X, b) 200
(1, X, total) 300
(1, Y, a) 35
(1, Y, b) 150
(1, Y, c) 35
(1, Y, total) 220
(1, total) 520
(2, X, a) 48
(2, X, b) 100
(2, X, c) 50
(2, X, total) 198
(2, Y, a) 80
(2, Y, total) 80
(2, total) 278
I am not sure how to convert this to columns as in your df.
Author by
user12392864
Updated on June 13, 2022Comments
-
user12392864 almost 2 years
I am aware of this link but I didn't manage to solve my problem.
I have this below DataFrame from
pandas.DataFrame.groupby().sum()
:Value Level Company Item 1 X a 100 b 200 Y a 35 b 150 c 35 2 X a 48 b 100 c 50 Y a 80
and would like to add total rows for each level of index that I have to get:
Value Level Company Item 1 X a 100 b 200 Total 300 Y a 35 b 150 c 35 Total 520 Total 820 2 X a 48 b 100 c 50 Total 198 Y a 80 Total 80 Total 278 Total 1098
As request
level = list(map(int, list('111112222'))) company = list('XXYYYXXXY') item = list('ababcabca') value = [100,200,35,150,35,48,100,50,80] col = ['Level', 'Company', 'Item', 'Value'] df = pd.DataFrame([level,company,item,value]).T df.columns = col df.groupby(['Level', 'Company', 'Item']).sum()
-
davidbilla over 4 yearsIs this what you're expecting?
-
user12392864 over 4 yearsExactly what I wanted, thanks! Could you help on stackoverflow.com/questions/59364298/… ? @Yen
-
user12392864 over 4 yearsThe last dataframe has a KeyError: ‘Level’
-
davidbilla over 4 yearsDo you have a column called 'Level' in your dataframe? Can you post which statement exactly you are getting the error?
-
user12392864 over 4 yearsI do, on the initial one. The error is on the 4th line. df1 is a Series so you can’t groupby I guess this is the issue.
-
davidbilla over 4 yearsUsing your example, I am able to run this statement
print(df1.groupby(['Level','Company']).sum().to_frame().assign(Company='total').set_index('Company', append=True))
without any errors. -
user12392864 over 4 yearsis the
df1
ondf1.groupby(['Level','Company']).sum().to_frame().assign(Company='total').set_index('Company', append=True)
the same than thedf1
in this following line:df1 = (df.groupby(['Level', 'Company', 'Item'])['Value'].sum())
. If yes, how could wegroupby
df1
once more ? After the first line,df1
becomes apandas.Series
and we can'tgroupby
pandas.Series
. -
davidbilla over 4 yearsYes, df1 is the result of
df1 = (df.groupby(['Level', 'Company', 'Item'])['Value'].sum())
. See pandas.Series.groupby -
user12392864 over 4 yearsCan I see your output please?
-
user12392864 over 4 yearsLet us continue this discussion in chat.
-
user12392864 over 4 yearsthanks. It doesn’t work on Jupyter but does work on Pycharm.
-
davidbilla over 4 yearsSuper. That may be due to version differences. Are you using the same pandas version in both IDEs?