Adding a Grand Total to a Pandas Pivot Table
10,164
Based on your example that is posted:
# read your data from clipboard
df = pd.read_clipboard()
# run your pivot_table code from above
report = df.groupby(['SUPER_TYPE']).apply(lambda sub_df: sub_df.pivot_table(index=['STRATA', 'OS_TYPE', 'STAND_NUMB', 'SILV_PRES'], values=['ACRES'],aggfunc=np.sum, margins=True,margins_name= 'TOTAL'))
# this is creating a new row at level(1) called grand total
# set it equal to the sum of ACRES where level(1) != 'Total' so you are not counting the calculated totals in the total sum
report.loc[('', 'Grand Total','','',''), :] = report[report.index.get_level_values(1) != 'TOTAL'].sum()
report
ACRES
SUPER_TYPE STRATA OS_TYPE STAND_NUMB SILV_PRES
HS HS3B HS3B 3092.0 OSR/2SS/SCC 17.3
3580.0 OSR/2SS/SCC 8.1
3581.0 OSR/2SS/SCC 16.6
3587.0 OSR/2SS/SCC 13.8
3594.0 OSR/2SS/SCC 31.7
3607.0 OSR/2SS/SCC 27.7
TOTAL 115.2
HW H3AB H3A 3571.0 OSR/2SS/SCC 30.7
3573.0 OSR/2SS/SCC 30.4
3585.0 OSR/2SS/SCC 25.8
3588.0 OSR/2SS/SCC 18.1
3589.0 OSR/2SS/SCC 54.7
3597.0 OSR/2SS/SCC 41.6
3601.0 OSR/2SS/SCC 11.9
. . .
Grand Total 813.6
Related videos on Youtube
Author by
Clickinaway
Updated on June 04, 2022Comments
-
Clickinaway almost 2 years
I'm stuck. In my code below I can successfully create the subtotaled pivot table I'm looking for but cannot produce a grand total. [The following code leverages the arcgis module; this simply converts a table (in this case an MSSQL table) to a NumPy array]
import numpy as np import pandas as pd import arcpy table = "\\\\filserver\\MAP_PROJECTS\\LV_WEB\\SDE_CONNECTIONS\\LV_NEXUS.sde\\LV_NEXUS.DATAOWNER.NORTHEAST\\LV_NEXUS.DATAOWNER.NE_HARVEST_OPS" HUID = "669-NMTC-139" whereClause = """ "LV_HARVEST_UNIT_ID" = '{0}' """.format(HUID) tableArray = arcpy.da.TableToNumPyArray(table, ['STAND_NUMB', 'SUPER_TYPE','STRATA', 'OS_TYPE', 'SILV_PRES', 'ACRES'], where_clause = whereClause) df = pd.DataFrame(tableArray) report = df.groupby(['SUPER_TYPE']).apply(lambda sub_df: sub_df.pivot_table(index=['STRATA', 'OS_TYPE', 'STAND_NUMB', 'SILV_PRES'], values=['ACRES'],aggfunc=np.sum, margins=True,margins_name= 'TOTAL')) np.round(report,1)
This provides a total for each 'SUPER_TYPE' group, but I cannot create a grand total. I tried the following:
grandtotal = np.round(np.sum(report),1) grandtotal.name = 'Grand Total' report.append(grandtotal)
and that just creates a terrible mess. It appends the grand total but destroys the formatting of my data frame.
Dataframe pasted below not sure how to keep the formatting
STAND_NUMB SUPER_TYPE STRATA OS_TYPE SILV_PRES ACRES 0 3113 SH SH3B SH3B OSR/2SS/SCC 0.612748 1 3608 HW H3AB H3B OSR/2SS/SCC 12.936038 2 3105 HW H3AB H3B OSR/2SS/SCC 35.199887 3 3607 HS HS3B HS3B OSR/2SS/SCC 27.683348 4 3601 HW H3AB H3A OSR/2SS/SCC 11.941338 5 3603 HW H4B H4B OSR/2SS/SCC 25.307238 6 3092 HS HS3B HS3B OSR/2SS/SCC 17.331220 7 3600 HW H4B H4B OSR/2SS/SCC 13.443112 8 3596 HW H3AB H3B OSR/2SS/SCC 12.375962 9 3597 HW H3AB H3A OSR/2SS/SCC 41.639072 10 3591 SW S4BC S4A OSR/2SS/SCC 11.355869 11 3594 HS HS3B HS3B OSR/2SS/SCC 31.747874 12 3586 HW H3AB H3B OSR/2SS/SCC 19.437834 13 3588 HW H3AB H3A OSR/2SS/SCC 18.129702 14 3587 HS HS3B HS3B OSR/2SS/SCC 13.788853 15 3585 HW H3AB H3A OSR/2SS/SCC 25.775322 16 3582 SH SH3B SH3B OSR/2SS/SCC 11.026199 17 3581 HS HS3B HS3B OSR/2SS/SCC 16.634195 18 3589 HW H3AB H3A OSR/2SS/SCC 54.684222 19 3579 SH SH3B SH3B OSR/2SS/SCC 17.313354 20 3578 HW H4C_H2C H4C OSR/2SS/SCC 30.255013 21 3576 HW H3C H3C OSR/2SS/SCC 11.310230 22 3573 HW H3AB H3A OSR/2SS/SCC 30.369559 23 3575 HW H4C_H2C H4C OSR/2SS/SCC 53.088547 24 3569 HW H4A H4A OSR/2SS/SCC 12.809001 25 3567 HW H4B H4B OSR/2SS/SCC 24.026682 26 3568 HW H3AB H3B OSR/2SS/SCC 57.934207 27 3565 HW H4B H4B OSR/2SS/SCC 33.545768 28 3605 HW H3AB H3B OSR/2SS/SCC 74.424945 29 3580 HS HS3B HS3B OSR/2SS/SCC 8.062028 30 3571 HW H3AB H3A OSR/2SS/SCC 30.718121 31 3562 HW H3AB H3B OSR/2SS/SCC 22.774026 32 3110 SW S3C S3C OSR/2SS/SCC 2.240600 33 3120.1 SH SH3B SH3B OSR/2SS/SCC 3.726728
-
It_is_Chris over 5 yearscan you do something like this:
df.loc[('', 'TOTAL','','',''), :] = df['ACRES'].sum()
-
ALollz over 5 yearsHave a look at the solutions to Pivot table subtotals in Pandas
-
Clickinaway over 5 years@ALollz I've actually been trying to make that work for the last 30 minutes or so but no luck.
-
Clickinaway over 5 years@Chris KeyError: "['' 'TOTAL' '' '' ''] not in index"
-
It_is_Chris over 5 years@Clickinaway I am not getting an error on my test data you should be able to do
df.loc[('', 'Grand Total','','',''), :] = df[df.index.get_level_values(1) != 'TOTAL'].sum()
-
Parfait over 5 yearsPlease set up a reproducible example including all
import
lines and data we can run in our empty Python environments. -
Clickinaway over 5 years@Parfait I will do with the caveat that I am utilizing the arcgis python module which simply converts a table (in this case an SQL table) to a numpy array.
-
Clickinaway over 5 years@Chris "IndexError: Too many levels: Index has only 1 level, not 2" I can try to add another level to my index?
-
It_is_Chris over 5 years@Clickinaway what is the output of
df.index
your image shows a multiindex of 5 levels. Can you copy and paste your dataframe and not use an image so it can be copied? -
Clickinaway over 5 years@Chris RangeIndex(start=0, stop=34, step=1)
-
Clickinaway over 5 years@Chris I think there are only 4 values in my pivot's multiindex
-
It_is_Chris over 5 years@Clickinaway in my example it should be run on your final df (the pivoted df) as is
report
and not the variabledf
-
Clickinaway over 5 years@Chris that makes more sense. The output of report.index is quite large, is there something i can verify or should i just paste all of it?
-
Clickinaway over 5 years@Chris, wait...I got your example to work. It does generate the grand total. Do I then try to append it to my original 'report'? ChrisSum = report.loc[('', 'Grand Total','','',''), :] = report[report.index.get_level_values(1) != 'TOTAL'].sum() This returns: ACRES 813.648841 dtype: float64
-
Clickinaway over 5 years@Parfait I've spent a while reviewing the reproducible example link you sent...I'm really not sure how to get around the multiindex issue! I'm very sorry.
-
Clickinaway over 5 years@Parfait thank you so much for reformatting!
-
It_is_Chris over 5 years@Clickinaway no need to do anything just do
report.loc[('', 'Grand Total','','',''), :] = report[report.index.get_level_values(1) != 'TOTAL'].sum()
and then callprint(report)
but it looks like you are in jupyter so just call report and your last row should have the sum -
Clickinaway over 5 years@Chris that worked!!! I will happily mark your answer as correct, thanks so much!
-
-
Kundan over 2 yearsHow can I add acres %, this will be % of acres value from the parent total (not the grand total) ?