Filtering and selecting from pivot tables made with python pandas

python indexing pivot pivot-table pandas

42,497

Solution 1

Pivot table returns a DataFrame so you can simply filter by doing:

In [15]: pivoted = pivot_table(df, values='D', rows=['A', 'B'], cols=['C'])

In [16]: pivoted[pivoted.foo < 0]
Out[16]: 
C             bar       foo
A     B                    
one   A -0.412628 -1.062175
three B       NaN -0.562207
two   A       NaN -0.007245

You can use something like

pivoted.ix['one']

to select all A series groups

pivoted.ix['one', 'A']

to select distinct A and B series groups

Solution 2

Just to add info at the previous answer. When you try to use pivoted.ix['one'] in python3 you get the following message:

/usr/lib/python3.7/site-packages/ipykernel_launcher.py:7: DeprecationWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing

See the documentation here: http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated import sys

So, in this version of python, use the following code instead:

pivoted.loc['one']

42,497

Author by

alexhli

Updated on July 09, 2022

Comments

alexhli almost 2 years

I'm struggling with hierarchical indexes in the Python pandas package. Specifically I don't understand how to filter and compare data in rows after it has been pivoted.

Here is the example table from the documentation:

import pandas as pd
import numpy as np

In [1027]: df = pd.DataFrame({'A' : ['one', 'one', 'two', 'three'] * 6,
                              'B' : ['A', 'B', 'C'] * 8,
                              'C' : ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
                              'D' : np.random.randn(24),
                              'E' : np.random.randn(24)})

In [1029]: pd.pivot_table(df, values='D', rows=['A', 'B'], cols=['C'])
Out[1029]: 
    C             bar       foo
    A     B                    
    one   A -1.154627 -0.243234
          B -1.320253 -0.633158
          C  1.188862  0.377300
    three A -1.327977       NaN
          B       NaN -0.079051
          C -0.832506       NaN
    two   A       NaN -0.128534
          B  0.835120       NaN
          C       NaN  0.838040

I would like to analyze as follows:

1) Filter this table on column attributes, for example selecting rows with negative foo:

    C             bar       foo
    A     B                    
    one   A -1.154627 -0.243234
          B -1.320253 -0.633158
    three B       NaN -0.079051
    two   A       NaN -0.128534

2) Compare the remaining B series values between the distinct A series groups? I am not sure how to access this information: {'one':['A','B'], 'two':['A'], 'three':['B']} and determine which series B values are unique to each key, or seen in multiple key groups, etc

Is there a way to do this directly within the pivot table structure, or do I need to convert this back in to a pandas dataframe?

Update: I think this code is a step in the right direction. It at least lets me access individual values within this table, but I am still hard-coding the series vales:

table = pivot_table(df, values='D', rows=['A', 'B'], cols=['C'])
table.ix['one', 'A']

Recents

Why Is PNG file with Drop Shadow in Flutter Web App Grainy?

How to troubleshoot crashes detected by Google Play Store for Flutter app

Cupertino DateTime picker interfering with scroll behaviour

Why does awk -F work for most letters, but not for the letter "t"?

Flutter change focus color and icon color but not works

How to print and connect to printer using flutter desktop via usb?

Critical issues have been reported with the following SDK versions: com.google.android.gms:play-services-safetynet:17.0.0

Flutter Dart - get localized country name from country code

navigatorState is null when using pushNamed Navigation onGenerateRoutes of GetMaterialPage

Android Sdk manager not found- Flutter doctor error

Flutter Laravel Push Notification without using any third party like(firebase,onesignal..etc)

How to change the color of ElevatedButton when entering text in TextField

pandas pivot_table keep index

Pandas NaN introduced by pivot_table

How to remove multilevel index in pandas pivot table

pandas pivot table rename columns

Accessing Another Column By Value ,Pandas

Error iterating through a Pandas series

How does one append large amounts of data to a Pandas HDFStore and get a natural unique index?

Label based indexing Pandas (.loc)

Pandas - how to convert RangeIndex into DateTimeIndex

Python - Pandas subtotals on groupby