Adding a Grand Total to a Pandas Pivot Table

Based on the example you posted:

import numpy as np
import pandas as pd

# Read the sample data from the clipboard
df = pd.read_clipboard()

# Run the pivot_table code from the question:
# one pivot per SUPER_TYPE, each with its own 'TOTAL' margin row
report = df.groupby(['SUPER_TYPE']).apply(
    lambda sub_df: sub_df.pivot_table(
        index=['STRATA', 'OS_TYPE', 'STAND_NUMB', 'SILV_PRES'],
        values=['ACRES'], aggfunc=np.sum, margins=True, margins_name='TOTAL'))

# Add a new row labelled 'Grand Total' at index level 1.
# Sum ACRES only where level 1 != 'TOTAL', so the per-group subtotals are not counted twice.
report.loc[('', 'Grand Total', '', '', ''), :] = report[report.index.get_level_values(1) != 'TOTAL'].sum()
report


                                                     ACRES
SUPER_TYPE  STRATA  OS_TYPE STAND_NUMB  SILV_PRES   
HS          HS3B    HS3B      3092.0    OSR/2SS/SCC 17.3
                              3580.0    OSR/2SS/SCC 8.1
                              3581.0    OSR/2SS/SCC 16.6
                              3587.0    OSR/2SS/SCC 13.8
                              3594.0    OSR/2SS/SCC 31.7
                              3607.0    OSR/2SS/SCC 27.7
            TOTAL                                   115.2
HW          H3AB       H3A    3571.0    OSR/2SS/SCC 30.7
                              3573.0    OSR/2SS/SCC 30.4
                              3585.0    OSR/2SS/SCC 25.8
                              3588.0    OSR/2SS/SCC 18.1
                              3589.0    OSR/2SS/SCC 54.7
                              3597.0    OSR/2SS/SCC 41.6
                              3601.0    OSR/2SS/SCC 11.9
                     . . . 

         Grand Total                                813.6
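
For readers without the original table, the same idea can be tried on a few made-up rows. This is a minimal sketch rather than the answer's exact code: the data values are invented, only two index levels are used instead of five, and the per-group pivots are combined with pd.concat instead of groupby(...).apply(...), a substitution made here only to keep the example short.

```python
import pandas as pd

# Made-up sample rows (not the data from the question)
df = pd.DataFrame({
    "SUPER_TYPE": ["HS", "HS", "HW", "HW", "HW"],
    "STRATA": ["HS3B", "HS3B", "H3AB", "H3AB", "H4B"],
    "ACRES": [17.0, 8.0, 30.0, 25.0, 20.0],
})

# One pivot per SUPER_TYPE, each with its own 'TOTAL' margin row
pieces = {
    super_type: sub_df.pivot_table(
        index=["STRATA"], values=["ACRES"],
        aggfunc="sum", margins=True, margins_name="TOTAL")
    for super_type, sub_df in df.groupby("SUPER_TYPE")
}
report = pd.concat(pieces, names=["SUPER_TYPE"])

# Sum only the detail rows (level 1 != 'TOTAL') so subtotals are not double counted,
# then write the result into a new row labelled 'Grand Total'
detail_rows = report.index.get_level_values(1) != "TOTAL"
report.loc[("", "Grand Total"), :] = report[detail_rows].sum()

print(report)
```

The tuple passed to .loc needs one entry per index level, which is why the answer above uses five entries for its five-level index.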

Author: Clickinaway

Updated on June 04, 2022

Comments

  • Clickinaway
    Clickinaway almost 2 years

    I'm stuck. With the code below I can successfully create the subtotaled pivot table I'm looking for, but I cannot produce a grand total. [The code uses the arcpy module from ArcGIS; here it simply converts a table (in this case an MSSQL table) to a NumPy array.]

    import numpy as np
    import pandas as pd
    import arcpy
    
    table = "\\\\filserver\\MAP_PROJECTS\\LV_WEB\\SDE_CONNECTIONS\\LV_NEXUS.sde\\LV_NEXUS.DATAOWNER.NORTHEAST\\LV_NEXUS.DATAOWNER.NE_HARVEST_OPS"
    HUID = "669-NMTC-139"
    whereClause = """ "LV_HARVEST_UNIT_ID" = '{0}' """.format(HUID)
    tableArray = arcpy.da.TableToNumPyArray(table, ['STAND_NUMB', 'SUPER_TYPE','STRATA', 'OS_TYPE', 'SILV_PRES', 'ACRES'], where_clause = whereClause)
    df = pd.DataFrame(tableArray)
    report = df.groupby(['SUPER_TYPE']).apply(lambda sub_df:  sub_df.pivot_table(index=['STRATA', 'OS_TYPE', 'STAND_NUMB', 'SILV_PRES'], values=['ACRES'],aggfunc=np.sum, margins=True,margins_name= 'TOTAL'))
    np.round(report,1)
    

    This provides a total for each 'SUPER_TYPE' group, but I cannot create a grand total. I tried the following:

    grandtotal = np.round(np.sum(report),1)
    grandtotal.name = 'Grand Total'
    report.append(grandtotal)
    

    and that just creates a terrible mess. It appends the grand total but destroys the formatting of my data frame.

    The dataframe is pasted below; I'm not sure how to keep the formatting.

       STAND_NUMB SUPER_TYPE   STRATA OS_TYPE    SILV_PRES      ACRES
    0        3113         SH     SH3B    SH3B  OSR/2SS/SCC   0.612748
    1        3608         HW     H3AB     H3B  OSR/2SS/SCC  12.936038
    2        3105         HW     H3AB     H3B  OSR/2SS/SCC  35.199887
    3        3607         HS     HS3B    HS3B  OSR/2SS/SCC  27.683348
    4        3601         HW     H3AB     H3A  OSR/2SS/SCC  11.941338
    5        3603         HW      H4B     H4B  OSR/2SS/SCC  25.307238
    6        3092         HS     HS3B    HS3B  OSR/2SS/SCC  17.331220
    7        3600         HW      H4B     H4B  OSR/2SS/SCC  13.443112
    8        3596         HW     H3AB     H3B  OSR/2SS/SCC  12.375962
    9        3597         HW     H3AB     H3A  OSR/2SS/SCC  41.639072
    10       3591         SW     S4BC     S4A  OSR/2SS/SCC  11.355869
    11       3594         HS     HS3B    HS3B  OSR/2SS/SCC  31.747874
    12       3586         HW     H3AB     H3B  OSR/2SS/SCC  19.437834
    13       3588         HW     H3AB     H3A  OSR/2SS/SCC  18.129702
    14       3587         HS     HS3B    HS3B  OSR/2SS/SCC  13.788853
    15       3585         HW     H3AB     H3A  OSR/2SS/SCC  25.775322
    16       3582         SH     SH3B    SH3B  OSR/2SS/SCC  11.026199
    17       3581         HS     HS3B    HS3B  OSR/2SS/SCC  16.634195
    18       3589         HW     H3AB     H3A  OSR/2SS/SCC  54.684222
    19       3579         SH     SH3B    SH3B  OSR/2SS/SCC  17.313354
    20       3578         HW  H4C_H2C     H4C  OSR/2SS/SCC  30.255013
    21       3576         HW      H3C     H3C  OSR/2SS/SCC  11.310230
    22       3573         HW     H3AB     H3A  OSR/2SS/SCC  30.369559
    23       3575         HW  H4C_H2C     H4C  OSR/2SS/SCC  53.088547
    24       3569         HW      H4A     H4A  OSR/2SS/SCC  12.809001
    25       3567         HW      H4B     H4B  OSR/2SS/SCC  24.026682
    26       3568         HW     H3AB     H3B  OSR/2SS/SCC  57.934207
    27       3565         HW      H4B     H4B  OSR/2SS/SCC  33.545768
    28       3605         HW     H3AB     H3B  OSR/2SS/SCC  74.424945
    29       3580         HS     HS3B    HS3B  OSR/2SS/SCC   8.062028
    30       3571         HW     H3AB     H3A  OSR/2SS/SCC  30.718121
    31       3562         HW     H3AB     H3B  OSR/2SS/SCC  22.774026
    32       3110         SW      S3C     S3C  OSR/2SS/SCC   2.240600
    33     3120.1         SH     SH3B    SH3B  OSR/2SS/SCC   3.726728
    
    • It_is_Chris
      It_is_Chris over 5 years
      can you do something like this: df.loc[('', 'TOTAL','','',''), :] = df['ACRES'].sum()
    • ALollz
      ALollz over 5 years
      Have a look at the solutions to Pivot table subtotals in Pandas
    • Clickinaway
      Clickinaway over 5 years
      @ALollz I've actually been trying to make that work for the last 30 minutes or so but no luck.
    • Clickinaway
      Clickinaway over 5 years
      @Chris KeyError: "['' 'TOTAL' '' '' ''] not in index"
    • It_is_Chris
      It_is_Chris over 5 years
      @Clickinaway I am not getting an error on my test data. You should be able to do df.loc[('', 'Grand Total','','',''), :] = df[df.index.get_level_values(1) != 'TOTAL'].sum()
    • Parfait
      Parfait over 5 years
      Please set up a reproducible example including all import lines and data we can run in our empty Python environments.
    • Clickinaway
      Clickinaway over 5 years
      @Parfait I will do with the caveat that I am utilizing the arcgis python module which simply converts a table (in this case an SQL table) to a numpy array.
    • Clickinaway
      Clickinaway over 5 years
      @Chris "IndexError: Too many levels: Index has only 1 level, not 2" I can try to add another level to my index?
    • It_is_Chris
      It_is_Chris over 5 years
      @Clickinaway what is the output of df.index? Your image shows a MultiIndex of 5 levels. Can you copy and paste your dataframe as text instead of an image, so it can be copied?
    • Clickinaway
      Clickinaway over 5 years
      @Chris RangeIndex(start=0, stop=34, step=1)
    • Clickinaway
      Clickinaway over 5 years
      @Chris I think there are only 4 values in my pivot's multiindex
    • It_is_Chris
      It_is_Chris over 5 years
      @Clickinaway my example should be run on your final, pivoted DataFrame (that is, report), not on the variable df
    • Clickinaway
      Clickinaway over 5 years
      @Chris that makes more sense. The output of report.index is quite large, is there something i can verify or should i just paste all of it?
    • Clickinaway
      Clickinaway over 5 years
      @Chris, wait... I got your example to work. It does generate the grand total. Do I then try to append it to my original 'report'? I ran ChrisSum = report.loc[('', 'Grand Total','','',''), :] = report[report.index.get_level_values(1) != 'TOTAL'].sum() and it returns: ACRES 813.648841 dtype: float64
    • Clickinaway
      Clickinaway over 5 years
      @Parfait I've spent a while reviewing the reproducible example link you sent...I'm really not sure how to get around the multiindex issue! I'm very sorry.
    • Clickinaway
      Clickinaway over 5 years
      @Parfait thank you so much for reformatting!
    • It_is_Chris
      It_is_Chris over 5 years
      @Clickinaway no need to do anything else. Just run report.loc[('', 'Grand Total','','',''), :] = report[report.index.get_level_values(1) != 'TOTAL'].sum() and then call print(report). It looks like you are in Jupyter, though, so you can just evaluate report and the last row should hold the sum.
    • Clickinaway
      Clickinaway over 5 years
      @Chris that worked!!! I will happily mark your answer as correct, thanks so much!
  • Kundan
    Kundan over 2 years
    How can I add an acres % column, i.e. each ACRES value as a percentage of its parent total (not the grand total)?
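
One possible way to approach the percentage question above, sketched under assumptions: the rows below are made up, ACRES_PCT is a hypothetical column name, and the percentage is computed on the flat dataframe before pivoting, using groupby(...).transform('sum') to broadcast each SUPER_TYPE total back onto its rows.

```python
import pandas as pd

# Made-up sample rows (not the data from the question)
df = pd.DataFrame({
    "SUPER_TYPE": ["HS", "HS", "HW", "HW"],
    "STAND_NUMB": [3092, 3580, 3571, 3573],
    "ACRES": [10.0, 30.0, 20.0, 60.0],
})

# transform('sum') returns one value per original row: the total of that row's group
group_total = df.groupby("SUPER_TYPE")["ACRES"].transform("sum")

# Share of each row within its parent (SUPER_TYPE) total, as a percentage
df["ACRES_PCT"] = df["ACRES"] / group_total * 100

print(df)
```

Because the new column lives on the flat dataframe, it could then be carried into the pivot alongside ACRES; how its margin rows should be aggregated is a separate choice that this sketch does not cover.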