Filter DataFrame based on Max value in Column - Pandas


You could group by Browser and take the maximum of Metric1 within each group:

In [11]: g = df.groupby('Browser')

In [12]: g['Metric1'].max()
Out[12]:
Browser
Chrome/29    3000
FF           2000
IE           1000
Opera        3000
Name: Metric1, dtype: int64

In [13]: over2000 = g['Metric1'].max() > 2000

In [14]: over2000
Out[14]:
Browser
Chrome/29     True
FF           False
IE           False
Opera         True
Name: Metric1, dtype: bool

To get the array of browser names, use the boolean Series as a mask on itself and take the values of the resulting index:

In [15]: over2000[over2000].index.values
Out[15]: array(['Chrome/29', 'Opera'], dtype=object)
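For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample DataFrame from the question (the Hour column is left out because it plays no part in the filter), and the variable names `max_metric1`, `browsers` and `browsers_alt` are just illustrative choices. It also shows a row-filter variant that should give the same set of browsers, because a browser's maximum exceeds 2000 exactly when at least one of its rows does.

```python
import pandas as pd

# Sample data copied from the question (Hour column omitted).
df = pd.DataFrame({
    "Browser": ["IE", "FF", "Opera", "Chrome/29", "Chrome/29", "Chrome/29"],
    "Metric1": [1000, 2000, 3000, 3000, 3000, 3000],
    "Metric2": [500, 250, 450, 450, 450, 750],
    "Metric3": [3000, 6000, 9000, 9000, 9000, 9000],
})

# Per-browser maximum of Metric1, indexed by browser name.
max_metric1 = df.groupby("Browser")["Metric1"].max()

# Keep only the browsers whose maximum exceeds the threshold,
# then pull the browser names out of the index.
browsers = max_metric1[max_metric1 > 2000].index.values

# Alternative: filter the rows directly and take the distinct browsers.
# The result is in order of first appearance rather than sorted.
browsers_alt = df.loc[df["Metric1"] > 2000, "Browser"].unique()
```

The groupby version is the more general of the two: it keeps working if the condition depends on a different aggregate (say, the mean), whereas the row filter relies on the condition being about the maximum.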

Author: DJElbow

Updated on May 26, 2022

Comments

  • DJElbow
    DJElbow almost 2 years

    Using pandas, I have a DataFrame that looks like this:

    Hour            Browser     Metric1   Metric2   Metric3
    2013-08-18 00   IE          1000      500       3000
    2013-08-19 00   FF          2000      250       6000
    2013-08-20 00   Opera       3000      450       9000
    2001-03-21 00   Chrome/29   3000      450       9000
    2013-08-21 00   Chrome/29   3000      450       9000
    2014-01-22 00   Chrome/29   3000      750       9000
    

    I want to create an array of the browsers whose maximum value of Metric1 is greater than 2000. What is the best way to do this? The code below shows roughly what I am trying to do.

    browsers = df[df.Metric1.max() > 2000]['Browser'].unique()